FLUOROPHORE COMPLEMENTATION PRODUCTS






Field of invention 

The present invention relates to various split fluorophore complementation products,
in particular to ways of obtaining intensely fluorescent complementation systems with GFP.

Background of the invention

It has been suggested to use the reassembly of fragments of certain enzymes into the
complete enzyme as a measure of protein-protein interactions. Johnsson and Varshavsky
(Johnsson, N., Varshavsky, A. (1994) Proc. Natl. Acad. Sci. U. S. A. 91, 10340-10344)
disclose reassembly of ubiquitin. This reassembly is detected through the irreversible
cleavage of the fusion by a ubiquitin protease and release of a reporter. As opposed to the
two-hybrid technique, this technique includes the possibility of monitoring a protein-protein
interaction as a function of time, at the natural sites of this interaction in a living cell.

Similar systems are suggested for the reassembly of other proteins including
β-galactosidase (Rossi, F., Charlton, C.A., Blau, H.M. (1997) Proc. Natl. Acad. Sci. U. S. A.
94, 8405-8410), dihydrofolate reductase (DHFR, WO98/34120), and β-lactamase
(Wehrman, T., Kleaveland, B., Her, J.H., Balint, R.F., Blau, H.M. (2002) Proc. Natl. Acad.
Sci. U. S. A. 99, 3469-3474). The basic concept is that by splitting a functional protein in
two fragments, the function is lost. The two fragments are transformed or transfected into
cells fused in frame to proteins X and Y, respectively. Binding between proteins X and Y
will bring the two fragments close together, increasing the local concentration of the
complementing fragments, induce folding of these fragments and produce a functional
protein with an activity that is similar to that of the non-fragmented protein. If the function
is DHFR activity, the cells will survive only if proteins X and Y bind to each other.

Recently, the use of a somewhat similar system has been described for the assisted
reassembly and folding of fragments of fluorescent proteins. As the function is
fluorescence, the cells will emit light upon excitation only if protein X and protein Y bind to
each other, thereby assisting complementation.

Ghosh (I. Ghosh, A.D. Hamilton, L. Regan (2000) J. Am. Chem. Soc. 122, 5658-5659,
WO01/87919) describes the use of a GFP variant called sg100 (F64L, S65C, Q80R,
Y151L, I167T and K238N). This GFP has single fluorescence excitation and emission
peaks at 475 nm and 505 nm, respectively (similar to sg25 described by Palm (Palm,
G.J., Zdanov, A., Gaitanaris, G.A., Stauber, R., Pavlakis, G.N., Wlodawer, A. (1997) Nat.
Struct. Biol. 4, 361-365)).

Functional GFP fragment complementation is accomplished by co-expressing two
independent peptides composed of the first 157 N-terminal amino acids of this GFP
(NGFP) and the remaining 81 C-terminal amino acids (starting from residue 158) of this
GFP (CGFP), with each of the GFP peptide fragments being fused to interacting leucine
zipper peptides that serve to associate the fragments.

Nagai (T. Nagai, A. Sawano, E.S. Park, A. Miyawaki (2001) Proc. Natl. Acad. Sci. U. S. A. 98,
3197-3202) tests a yellow fluorescent GFP variant that has the following mutations: S65G,
V68L, Q69K, S72A, T203Y. This variant was split between residues N144 and Y145
within the open 129-145 loop region, and the peptides fused to M13 and calmodulin,
respectively, for use in a Ca2+ assay. However, when the constructs were transfected
individually into HeLa cells, the assay was not reliable.

Hu (Chang-Deng Hu, Yurii Chinenov, and Tom Kerppola (2002) Mol. Cell 9, 789-798)
describes the use of the GFP variant called EYFP (S65G, S72A, T203Y). The two GFP
fragments which are described to work are the first 154 amino acids of GFP in one
construct and amino acids 155 through 238 in the other construct.

However, there is still a need for alternative GFPs for use in this technology.

Summary of the invention 

The present application discloses that certain GFPs can be reassembled and form a
functional fluorescent protein when expressed as two independent proteins. For example,
when EGFP is expressed in mammalian cells, choosing a split site located in a loop
region between the residues that form the beta-sheet structures of the GFP beta-barrel
results in intense fluorescence (Example 5 and Example 7). The present application
further illustrates that EYFP is also reassembled and, surprisingly, the fluorescence from
the reassembled protein is markedly enhanced if it contains the F64L mutation (Example
9).

Detailed disclosure 

The non-fluorescent fragments of fluorescent proteins that can be combined to form one
functional fluorescent unit are usually produced by splitting the coding nucleotide
sequence of one fluorescent protein at an appropriate site and expressing each
nucleotide sequence fragment independently. The fluorescent protein fragments may be
expressed alone or in fusion with one or more protein fusion partners.

Thus, one aspect of the invention relates to two GFP fragments comprising
(a) an N-terminal fragment of GFP, comprising a continuous stretch of amino acids from
amino acid number 1 to amino acid number X of GFP, wherein the peptide bond between
amino acid number X and amino acid number X+1 is within a loop of GFP, and
(b) a C-terminal fragment of GFP, comprising a continuous stretch of amino acids
from amino acid number X+1 to amino acid number 238 of GFP.

Amino acid 1 is meant to indicate the first amino acid of GFP. Amino acid 238 is meant to
indicate the last amino acid of GFP.
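The split defined above can be illustrated with a minimal sketch. The function name `split_fragments` and the placeholder sequence are assumptions introduced here for illustration only; the placeholder is not the GFP sequence of SEQ ID NO: 1.

```python
def split_fragments(sequence: str, x: int) -> tuple[str, str]:
    """Split a protein sequence after residue X (1-based numbering).

    Returns (N-terminal fragment 1..X, C-terminal fragment X+1..end).
    """
    if not 1 <= x < len(sequence):
        raise ValueError("X must leave at least one residue in each fragment")
    return sequence[:x], sequence[x:]


# Toy 238-residue placeholder; NOT the real GFP sequence.
toy_gfp = "A" * 238
n_frag, c_frag = split_fragments(toy_gfp, 157)
print(len(n_frag), len(c_frag))  # 157 81
```

With X = 157 the fragment lengths match the 157 N-terminal and 81 C-terminal amino acids described for the Ghosh system in the background section.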

All residues are numbered according to the numbering of wild type A. victoria GFP
(GenBank accession no. M62653) and said numbering also applies to equivalent
positions in homologous sequences exemplified by alignment of fluorescent protein
sequences in Example 1. Thus, when working with truncated GFPs (compared to wild
type GFP) or when working with GFPs with additional amino acids, the numbering is
relative to the alignment.
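Numbering relative to an alignment can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `map_to_reference_numbering` and the two short aligned rows are hypothetical and are not taken from the alignment in Example 1.

```python
def map_to_reference_numbering(ref_aligned: str, other_aligned: str) -> dict[int, int]:
    """Map residue numbers of an aligned homolog to reference numbering.

    Both inputs are rows of the same alignment ('-' marks a gap).
    Only columns where both rows hold a residue are mapped.
    """
    if len(ref_aligned) != len(other_aligned):
        raise ValueError("aligned rows must have equal length")
    mapping = {}
    ref_pos = other_pos = 0
    for ref_char, other_char in zip(ref_aligned, other_aligned):
        if ref_char != "-":
            ref_pos += 1
        if other_char != "-":
            other_pos += 1
        if ref_char != "-" and other_char != "-":
            mapping[other_pos] = ref_pos
    return mapping


# Hypothetical toy rows, not taken from the alignment in Example 1.
ref_row = "MSKGE-ELF"
other_row = "M-KGEAELF"
print(map_to_reference_numbering(ref_row, other_row))
```

Residues of the homolog that fall in a gap of the reference row (here its fifth residue) receive no reference number in this sketch.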

Green Fluorescent Protein (GFP) is a 238 amino acid long protein derived from the
jellyfish Aequorea victoria (SEQ ID NO: 1). However, fluorescent proteins have also been
isolated from other members of the Coelenterata, such as the red fluorescent protein from
Discosoma sp. (Matz, M.V. et al. 1999, Nature Biotechnology 17: 969-973), GFP from
Renilla reniformis, GFP from Renilla muelleri or fluorescent proteins from other animals,
fungi or plants. The GFP exists in various modified forms including the blue fluorescent
variant of GFP (BFP) disclosed by Heim et al. (Heim, R. et al., 1994, Proc. Natl. Acad. Sci.
91:26, pp. 12501-12504), which is a Y66H variant of wild type GFP; the yellow fluorescent
variant of GFP (YFP) with the S65G, S72A, and T203Y mutations (WO98/06737); the cyan
fluorescent variant of GFP (CFP) with the Y66W colour mutation and optionally the F64L,
S65T, N146I, M153T, V163A folding/solubility mutations (Heim, R., Tsien, R.Y. (1996) Curr.
Biol. 6, 178-182). The most widely used variant of GFP is EGFP with the F64L and S65T
mutations (WO 97/11094 and WO 96/23810) and insertion of one valine residue after the
first Met. The F64L mutation affects the amino acid one position upstream of the chromophore.
GFP containing this folding mutation provides an increase in fluorescence intensity when the
GFP is expressed in cells at a temperature above about 30°C.

It is known that fluorescence in wild-type GFP is due to the presence of a chromophore,
which is generated by cyclisation and oxidation of the SYG sequence at positions 65-67 in
the predicted primary amino acid sequence and presumably, by the same reasoning, of the
SHG sequence and other GFP analogues at positions 65-67.

The present examples clearly illustrate how the fluorescence intensity from a reassembled
protein is enhanced in GFPs containing the F64L mutation as compared to GFPs without
this mutation. Thus, it is preferred that the GFP contains the F64L mutation, either by
selecting a GFP with this mutation (e.g. EGFP) or by introducing this mutation into the GFP of
choice (e.g. YFP as illustrated in Example 8).

In the nomenclature of GFP, an "E" is placed in front of the GFP (EGFP, EYFP, ECFP) to
indicate that this particular GFP is encoded by a nucleic acid with codon usage optimised for
mammalian cells. Most of these proteins also have an extra valine residue inserted after the
initial methionine residue, Met1. This residue is not considered in the numbering of the
residues. Thus, in a preferred embodiment, the GFP of the present invention is selected
from the group consisting of EGFP, EYFP, ECFP, dsRed and Renilla GFP.
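The convention that the inserted valine is not counted can be sketched as a simple index conversion. The function name `variant_index_to_wt_number` and its parameter are assumptions made for illustration; the sketch only covers the single valine insertion after Met1 described above.

```python
from typing import Optional


def variant_index_to_wt_number(index: int, has_extra_valine: bool = True) -> Optional[int]:
    """Convert a 1-based position in the expressed protein to wild-type numbering.

    Returns None for the inserted valine itself, which carries no number.
    """
    if index < 1:
        raise ValueError("positions are 1-based")
    if not has_extra_valine:
        return index
    if index == 1:
        return 1          # initial methionine, Met1
    if index == 2:
        return None       # inserted valine, not numbered
    return index - 1      # every later residue is shifted by one


print(variant_index_to_wt_number(65))  # 64
```

Under this convention the residue at position 65 of a valine-containing protein is the one referred to as position 64 in mutation names such as F64L.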

In some of the examples of the present application, EGFP is used. Thus, in a preferred
embodiment of the invention, the GFP is EGFP. However, Example 8 and Example 9
show that EYFP has certain advantages. Thus, in another preferred embodiment of the
invention, the GFP is EYFP.

In the present context, the numbering of wild-type GFP (SEQ ID NO: 1) (Chalfie, M., Tu,
Y., Euskirchen, G., Ward, W.W., Prasher, D.C. (1994) Science 263, 802-805; this variant
of GFP has a histidine residue in position 231) is used. Based on the crystal structure of
GFP (Yang, F., Moss, L.G., Phillips, G.N. (1996) Nat. Biotech. 14, 1246-1251), Figure 5,
Table 1 and the data presented in the examples, it is evident that a GFP split in almost any
loop will be re-assembled following appropriate spatial approximation of the
complementation fragments, assisted by the interaction of the conjugated proteins. For the
purpose of this application the term "loop" shall be understood as a turn or element of
irregular secondary structure.

Thus, in one aspect, the invention relates to two GFP fragments as described above,
wherein X is 7, 8, 11 or 12, preferably X is 9 or 10 within the Thr9-Val11 loop; or
wherein X is 21, 22, 25 or 26, preferably X is 23 or 24 within the Asn23-His25 loop; or
wherein X is 36, 37, 40 or 41, preferably X is 38 or 39 within the Thr38-Gly40 loop; or
wherein X is 46, 47, 56 or 57, preferably X is between 48 and 55 i.e. X is 48, 49, 50, 51,
52, 53, 54 or 55 within the Cys48-Pro56 loop; or
wherein X is 70, 71, 76 or 77, preferably X is between 72 and 75 i.e. X is 72, 73, 74 or 75
within the Ser72-Asp76 loop; or
wherein X is 79, 80, 83 or 84, preferably X is 81 or 82 within the His81-Phe83 loop; or
wherein X is 86, 87, 90 or 91, preferably X is 88 or 89 within the Met88-Glu90 loop; or
wherein X is 99, 100, 103 or 104, preferably X is 101 or 102 within the Lys101-Asp103
loop; or
wherein X is 112, 113, 118 or 119, preferably X is between 114 and 117 i.e. X is 114, 115,
116 or 117 within the Phe114-Thr118 loop; or
wherein X is 126, 127, 145 or 146, preferably X is between 128 and 144 i.e. X is 128, 129,
130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143 or 144 within the
Ile128-Tyr145 loop; or
wherein X is 152, 153, 160 or 161, preferably X is between 154 and 159 i.e. X is 154, 155,
156, 157, 158 or 159 within the Ala154-Gly160 loop; or
wherein X is 169, 170, 175 or 176, preferably X is between 171 and 174 i.e. X is 171, 172,
173 or 174 within the Ile171-Ser175 loop; or
wherein X is 186, 187, 197 or 198, preferably X is between 188 and 196 i.e. X is 188, 189,
190, 191, 192, 193, 194, 195 or 196 within the Ile188-Asp197 loop; or
wherein X is 208, 209, 215 or 216, preferably X is between 210 and 214 i.e. X is 210, 211,
212, 213 or 214 within the Asp210-Arg215 loop.

Table 1. GFP secondary structures. GFP wild type sequence amino acid numbering. α
and β indicate α-helical and β-sheet secondary structures, respectively.

Name        Position            Symbol
Helix 1     Lys3 - Thr9         α1
Sheet 1     Val11 - Asn23       β1
Sheet 2     His25 - Thr38       β2
Sheet 3     Gly40 - Cys48       β3
Helix 2     Pro56 - Ser72       α2
Helix 3     Asp76 - His81       α3
Helix 4     Phe83 - Met88       α4
Sheet 4     Glu90 - Lys101      β4
Sheet 5     Asp103 - Phe114     β5
Sheet 6     Thr118 - Ile128     β6
Sheet 7     Tyr145 - Ala154     β7
Sheet 8     Gly160 - Ile171     β8
Sheet 9     Ser175 - Ile188     β9
Sheet 10    Asp197 - Asp210     β10
Sheet 11    Arg215 - Gly228     β11


Based on the findings disclosed in the examples, it is concluded that appropriate splitting
sites in GFP are located in the loop regions between the residues that form the beta-sheet
structures of the GFP beta-barrel. Accordingly, splits in GFP are preferably made in the
Asn23-His25 loop, the Thr38-Gly40 loop, the Lys101-Asp103 loop, the Phe114-Thr118
loop, the Ile128-Tyr145 loop, the Ala154-Gly160 loop, the Ile171-Ser175 loop, the
Ile188-Asp197 loop or the Asp210-Arg215 loop (Table 1, Figure 5).
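The relationship between Table 1 and the listed values of X can be expressed as a short sketch. It assumes, as the enumerated values suggest, that a loop "A-B" gives preferred split sites X = A to B-1 and additionally listed sites A-2, A-1, B and B+1. The names `ELEMENTS`, `LOOPS` and `classify_split_site` are introduced here for illustration only.

```python
# Secondary-structure elements from Table 1: (name, first residue, last residue).
ELEMENTS = [
    ("Helix 1", 3, 9), ("Sheet 1", 11, 23), ("Sheet 2", 25, 38),
    ("Sheet 3", 40, 48), ("Helix 2", 56, 72), ("Helix 3", 76, 81),
    ("Helix 4", 83, 88), ("Sheet 4", 90, 101), ("Sheet 5", 103, 114),
    ("Sheet 6", 118, 128), ("Sheet 7", 145, 154), ("Sheet 8", 160, 171),
    ("Sheet 9", 175, 188), ("Sheet 10", 197, 210), ("Sheet 11", 215, 228),
]

# A loop runs from the last residue of one element to the first of the next.
LOOPS = [(a[2], b[1]) for a, b in zip(ELEMENTS, ELEMENTS[1:])]


def classify_split_site(x: int):
    """Classify split site X (peptide bond between residues X and X+1).

    Returns ("preferred", loop) when the bond lies within a loop,
    ("listed", loop) for the flanking positions, otherwise None.
    """
    for start, end in LOOPS:
        if start <= x <= end - 1:
            return "preferred", (start, end)
    for start, end in LOOPS:
        if x in (start - 2, start - 1, end, end + 1):
            return "listed", (start, end)
    return None


print(classify_split_site(157))  # ('preferred', (154, 160))
print(classify_split_site(7))    # ('listed', (9, 11))
print(classify_split_site(60))   # None
```

The sketch derives 14 loops from the 15 elements of Table 1, matching the 14 alternatives enumerated for X.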

The data in the present examples illustrate clearly that the Ala154-Gly160 loop is very
well suited for GFP reassembly. This is particularly the case when the GFP is divided
between amino acids Q157 and K158 (that is, when X is 157). Thus, a preferred
embodiment of the invention relates to two GFP fragments, wherein X is 157 within the
Ala154-Gly160 loop.

The data in the present examples also illustrate that the Ile171-Ser175 loop is very useful
for GFP reassembly. This is particularly the case when the GFP is divided between
amino acids E172 and D173 (that is, when X is 172). Thus, a preferred embodiment of the
invention relates to two GFP fragments, wherein X is 172 within the Ile171-Ser175 loop.

As illustrated in Example 9, fragments having overlapping sequences are also functional.
Thus one aspect of the invention relates to two GFP fragments comprising
(a) an N-terminal fragment of GFP, comprising a continuous stretch of amino acids from
amino acid number 1 to amino acid number X of GFP, wherein the peptide bond between
amino acid number X and amino acid number X+1 is within a loop of GFP and
(b) a C-terminal fragment of GFP, comprising a continuous stretch of amino acids from
amino acid number Y to amino acid number 238 of GFP, wherein Y<X creating an overlap
of the two GFP fragments, and wherein the peptide bond between amino acid Y-1 and
amino acid Y is within a loop of GFP.
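The residue ranges of overlapping fragments can be worked out with a small sketch. The function name `overlapping_fragments` and the example values X = 172 and Y = 158 are chosen here purely for illustration; they are not stated to be the values used in Example 9.

```python
def overlapping_fragments(x: int, y: int, length: int = 238) -> dict:
    """Residue ranges (inclusive, 1-based) for overlapping fragments with Y < X."""
    if not 2 <= y < x < length:
        raise ValueError("require 2 <= Y < X < length")
    return {
        "n_fragment": (1, x),        # amino acids 1..X
        "c_fragment": (y, length),   # amino acids Y..238
        "overlap": (y, x),           # residues present in both fragments
        "overlap_length": x - y + 1,
    }


# Illustrative values only: X in the Ile171-Ser175 loop, Y-1/Y bond in the Ala154-Gly160 loop.
print(overlapping_fragments(x=172, y=158))
```

In this illustration residues 158 to 172 (15 residues) are present in both fragments.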

These overlapping GFP fragments are very attractive in e.g. functional cloning systems
where highly flexible linker sequences are required due to the very diverse nature of the
fusion partners. The overlapping fragments permit either of the fusion partners to have a
long linker sequence.

For the purposes of deciding the value of Y in the C-terminal fragment of GFP
defined above, the same considerations as discussed for the value of X apply.

In order to obtain reassembly of the two halves of GFP, it is essential to have the two
halves of GFP fused to interaction partners that will bring said two halves of GFP so close
together that the protein halves will fold and form functional GFP (see Hu 2002 supra).
Thus, a preferred embodiment of the invention relates to a fusion protein comprising an
N-terminal fragment of GFP as described above conjugated to a first protein of interest. In a
particular embodiment the N-terminal fragment of GFP is fused in frame to the first protein
of interest. In similar embodiments, the present invention relates to two GFP fragments as
described above, wherein the C-terminal fragment of GFP is conjugated to a second
protein of interest. In a particular embodiment, the C-terminal fragment of GFP is fused in
frame to the second protein of interest.

As will be evident to the skilled person, the protein of interest may be fused to the GFP
fragment at either its N-terminus or its C-terminus. However, as illustrated in the examples,
the first protein of interest is preferably fused to the C-terminus of the N-terminal
fragment of GFP. Likewise, the second protein of interest is preferably fused to the
N-terminus of the C-terminal fragment of GFP.

As will be evident from the present examples, and e.g. Hu (2002 supra), the protein of 
interest is a protein, a peptide or a non-proteinaceous partner. 






In a typical embodiment of the invention, the fusion protein as described above, wherein
the fragment of GFP is fused in frame to a protein of interest, further comprises a linker
sequence between either fragment of GFP and the corresponding protein of interest.
The linker must be chosen depending on the protein of interest conjugated to the
fragment of GFP. Thus, the linker must be flexible. A long linker prevents steric hindrance of
the complementation by the protein of interest. However, short linkers keep the
fragments of GFP closer to each other and give better association.


The present invention also relates to the N-terminal fragment of GFP as described above. 
In a similar embodiment, the invention relates to the C-terminal fragment of GFP as 
described above.

A preferred embodiment of the invention relates to a nucleic acid encoding any of the
fragments or fusion proteins described above. In one embodiment, the nucleic acid
construct encoding any of the proteins according to the invention described above is a
DNA construct. In another embodiment, the nucleic acid construct encoding any of the
proteins according to the invention described above is an RNA construct.


One aspect of the invention relates to a cell containing the two GFP fragments described 
above. In similar embodiments, the invention relates to a cell containing the N-terminal 
fragment of GFP described above. In similar embodiments, the invention relates to a cell
containing the C-terminal fragment of GFP described above. 

Numerous cell systems for transfection exist. A few examples are mammalian cells isolated
directly from tissues or organs taken from healthy or diseased animals (primary cells), or
transformed mammalian cells capable of indefinite replication under cell culture conditions
(cell lines). The term "mammalian cell" is intended to indicate any living cell of mammalian
origin. The cell may be an established cell line, many of which are available from The
American Type Culture Collection (ATCC, Virginia, USA) or similar Cell Culture
Collections. The cell may be a primary cell with a limited life span derived from a
mammalian tissue, including tissues derived from a transgenic animal, or a newly
established immortal cell line derived from a mammalian tissue including transgenic
tissues, or a hybrid cell or cell line derived by fusing different cell types of mammalian
origin e.g. hybridoma cell lines. The cells may optionally express one or more non-native
gene products, e.g. receptors, enzymes, enzyme substrates, prior to or in addition to the
fluorescent probe. Preferred cell lines include, but are not limited to, those of fibroblast
origin, e.g. BHK, CHO, BALB, NIH-3T3, or of endothelial origin, e.g. HUVEC, BAE (bovine
artery endothelial), CPAE (cow pulmonary artery endothelial), HLMVEC (human lung
microvascular endothelial cells), or of airway epithelial origin, e.g. BEAS-2B, or of
pancreatic origin, e.g. RIN, INS-1, MIN6, bTC3, aTC6, bTC6, HIT, or of hematopoietic
origin, e.g. primary isolated human monocytes, macrophages, neutrophils, basophils,
eosinophils and lymphocyte populations, AML-14, AML-193, HL-60, RBL-1, U937, RAW,
JAWS, or of adipocyte origin, e.g. 3T3-L1, human pre-adipocytes, or of neuroendocrine
origin, e.g. AtT20, PC12, GH3, or of muscle origin, e.g. SKMC, A10, C2C12, or of renal
origin, e.g. HEK 293, LLC-PK1, or of neuronal origin, e.g. SK-N-DZ, SK-N-BE(2), HCN-1A,
NT2/D1.

One aspect of the invention relates to a method for detecting the interaction between two 
proteins of interest comprising the steps of: 

(a) providing at least one cell that contains two heterologous conjugates,

the first heterologous conjugate comprising a first protein of interest conjugated to an
N-terminal fragment of GFP as described above,

the second heterologous conjugate comprising a second protein of interest conjugated
to a C-terminal fragment of GFP as described above; and
(b) measuring the fluorescence from the at least one cell,

fluorescent cells indicating interaction between the two proteins of interest.

In a similar embodiment, the invention relates to a method for monitoring the interaction 
between two proteins of interest comprising the steps of: 

(a) providing at least one cell containing at least one stretch of nucleic acid encoding two
heterologous conjugates:

the first heterologous conjugate comprising a first protein of interest conjugated to an
N-terminal fragment of GFP as described above,

the second heterologous conjugate comprising a second protein of interest conjugated
to a C-terminal fragment of GFP as described above;
(b) culturing the at least one cell under conditions allowing expression; and
(c) measuring the fluorescence from the at least one cell, 
fluorescent cells indicating interaction between the two proteins of interest. 

In one aspect of the methods, one of the proteins of interest is known, whereas the other
protein of interest is an unknown protein. By parallel transfection, the cells containing
interaction partners of the known protein of interest will be fluorescent and thereby easily
detectable.

In a preferred aspect of the methods, the at least one cell is a mammalian cell.

In another preferred aspect of the methods, the heterologous conjugates are fusion
proteins.

This technology has broad applicability. Due to the direct detection of interactions, it can
be used in genomics and proteomics. The high sensitivity makes it applicable to target
discovery and the high specificity makes it applicable to target validation. It can be scaled
to drug discovery in high-throughput screening. The technology is quantitative, which
makes it applicable to nanotechnology.

The invention will be illustrated more specifically in the following non-limiting examples. 



Examples



Example 1: Alignment of fluorescent proteins 



GenBank entry    Fluorescent protein
P42212           Aequorea victoria green-fluorescent protein
AF372525         Renilla reniformis green fluorescent protein
AY015996         Renilla muelleri green fluorescent protein
AY013824         Aequorea macrodactyla isolate GFPxm
AF384683         Montastraea cavernosa green fluorescent protein
AF401282         Montastraea faveolata green fluorescent protein
AY015995         Ptilosarcus sp. CSG-2001 green fluorescent protein
AF322221         Anemonia sulcata green fluorescent protein asFP499
AF322222         Anemonia sulcata nonfluorescent red protein asCP562
AF246709         Anemonia sulcata GFP-like chromoprotein FP595
AF168419         DsRed Discosoma sp. fluorescent protein FP583
AF168420         Discosoma striata fluorescent protein FP483
AF168421         Anemonia majano fluorescent protein FP486
AF168422         Zoanthus sp. fluorescent protein FP506
AF168423         Zoanthus sp. fluorescent protein FP538
AF168424         Clavularia sp. fluorescent protein FP484

P42212 avGFP MS KGEELFTGWPI LVELDGD V 22

AY015996 rmGFP MSKQILKNTCLQEVMSYKVNLEGIV 25 

AF372525 rrGFP MD LAKLGLKEVMPT KI NLE GI*V 22

AF168419 dsRed MRSSKNVI KEFMRFKVRMEGTV 22 

AF322222 asCP562 MAS FLKKTMPFKTTI EGTV 19 

AF168422 asCP562 - MAQSKHGLTKEMTMKYRMEGCV 22 

AF246709 asFP595 - MAS FLKKTMPFKTTI EGTV 19 

AF322221 asFP499 - -MYPSIKETMRVQLSMEGSV 19 

AF384683 mcGFP MSVIKPIMEIKLRMQGW 18 

AF401282 mfGFP - - - - MS VI KPDMKI KLRMEGAV 18

AF168424 CSFP484 MKCKFVFCLSFLVLAITNANI FLRNEADLEEKTLRI PKAIiTTMGVI KPDMKI KLKMEGNV 60

AF168420 dsFP483 MS CS KS V I KEEML I DLHLEGT F 22 

AY015995 spGFP - MNRNVLKNTGLKEIMSAKASVEGIV 25 

AF168423 zsFP538 MAHSKHGLKEEMTMKYHMEGCV 22 

AF168421 amFP486 MALSNKFIGDDMKMTYHMDGCV 22 

AY013824 amGFPxm MSKGEELFTGIVPVLIELDGDV 22 

: : : :* . 

P42212 avGFP NGHKFSVSGEGEGDATYGKL- -TLKFICTTG- KLPVPWPTLVTTFSYGVQCFSRYPDHMK 79 

AY015996 rmGFP NNHVFTMEGCGKGNILFGNQ--LVQIRVTKGAPLPFAFDIVSPAFQYGNRTFTKYPNDIS 83 
AF372525 rrGFP GDHAFSMEGVGEGNI LEGTQ - - EVKI SVTKGAPLPFAFDI VSVAFS YGNRAYTGYPEEIS 80 
AF168419 dsRED NGHEFEIEGEGEGRPYEGHN- -TVKLKVTKGGPLPFAWDILSPQFQYGSKVYVKHPADIP 80 
AF322222 asCP562 NGHYFKCTGKGEGNPFEGTQ- -EMKIEVIEGGPLPFAFKILSTSCMYGSKTFIKYVSGIP 77 
AF168422 asCP562 DGHKFVI TGEGIGYPFKGKQ - - AINLCWEGGPLPFAEDIIiSAAFNYGNRVFTEYPQDIV 80 
AF246709 asFP595 NGHYFKCTGKGEGNPFEGTQ- -EMKIEVIEGGPLPFAFHILSTSCMYGSKTFIKYVSGIP 77 
AF322221 asFP499 NYHAFKCTGKGEGKPYEGTQ- -SLNITITEGGPLPFAFDILSHAFQYGIKVFAKYPKEI P 77 
AF384683 mcGFP NGHKFVI KGEGEGKPFEGTQ- -TINLTVKEGAPLPFAYDIliTSAFQYGNRVFTKYPDDIP 76 
AF401282 mfGFP NGHKFVI EGDGKGKPFEGTQ- -SMDLTVKEGAPLPFAYDI LTTVFDYGNRVFAKYPQDI P 76 
AF168424 CSFP484 NGHAFVIEGEGEGKPYDGTH--TLNLEVKEGAPLPFSYDIIiSNAFQYGNRAIjTKYPDDIA 118 
AF168420 dsFP483 NGHYFEI KGKGKGQPNEGTN- -TVTLEVTKGGPLPFGWHILCPQFQYGNKAFVHHPDNIH 80 
AY015995 spGFP NNHVFSMEGFGKGNVLFGNQ- - LMQIRVTKGGPLPFAFDIVSIAFQYGNRTFTKYPDDIA 83 
AF168423 ZSFP538 NGHKFVI TGEGIGYPFKGKQ - - TINLCVI EGGPLPFSEDILSAGFKYGDRI FTEYPQDI V 80 
AF168421 amFP486 NGHYFTVKGEGNGKPYEGTQTSTFKVTMANGGPLAFSFDILSTVFKYGNRCFTAYPTSMP 82 
AY013824 amGFPXM HGHKFSVRGEGEGDADYGKL- - EIKFICTTG - KLPVPWPTLVTTFS YGIQCFARYPEHMK 79

P42212 avGFP QHDFFKSAMPEGYVQBRTIFFKDDGNYKTRAEV- -KFEG- - -DTLVNRIELKGIDFKEDG 134

AY015996 rmGFP - - D YF IQS FPAGFM YERTLRYEDGGLVEI RSDI - - NLI E DKFV YRVE YKGSNF PDDG 136 

AF372525 rrGFP - -DYFLQSFPEGFTYERNIRYQDGGTAIVKSDI- -SLED- - -GKFIVNVDFKAKDLRRMG 133

AF168419 dsRED - -DYKKLSFPEGFKWERVMNFEDGGWTVTQDS- -SLQD GCFIYKVKFIGVNFPSDG 133

AF322222 asCP562 --DYFKQSFPEGFTWERTTTYEDGGFLTAHQDT- -SLDG- --DCLVYKVKILGNNFPADG 130 

AF168422 asCP562 - - DYF KNSCPAGYTWDRSFLFEDGAVCI CNADITVSVEEN CMYHESKFYGVNFPADG 135 

AF246709 asFP595 - - DYFKQS F PEGFTWERTTTYEDGG FLTAHQDT - - SLDG DCLVYKVKI LGNNFPADG 130 

AF322221 asFP499 - -DFFKQSLPGGFSWERVSTYEDGGVLSATQET--SLQG DCIICKVKVLGTNFPANG 130

AF384683 mcGFP - - DYFKQTFPEGYSWERIMAYEDQSICTATSDI - - KMEG DCFIYEIQFHGVNFPPNG 129

AF401282 mfGFP - -DYFKQTFPEGYSWERSMTYEDQGICVATNDI - - TLMKGVDDCFVYKIRFDGVNFPANG 132 

AF168424 CSFP484 - -DYFKQSFPEGYSWERTMTFEDKGIVKVKSDI- -SMEE DSFIYEIRFDGMNFPPNG 171 

AF168420 dsFP483 - -DYLKLSFPEGYTWERSMHFEDGGLCCITNDI- -SLTG NCFYYDI KFTGLNFPPNG 133

AY015995 SpGFP - -DYFVQSFPAGFFYERNLRFEDGAI VDIRSDI - - SLED DKFHYKVEYRGNGFPSNG 136 

AF168423 ZSFP538 - - DYFKNSCPAGYTWGRS FLFEDGAVCI CNVDI TVSVKEN CI YHKS I FNGMNFPADG 135

AF168421 amFP486 - -DYFKQAFPDGMSYERTFTYEDGGVATASWEI - - SLKGN CFEHKSTFHGVNFPADG 135 

AY013824 amGFPXM MNDFFKSAMPEGYIQERTIFFQDDGKYKTRGEV- - KFEG DTLVNRIELKGMDFKEDG 134 

P42212 avGFP NI LGHKLEYNYNSHNVYIMADKQKNGIKVNFKIRHNIEDGSVQLADHYQQN- -TPIGDGP 192

AY015996 rmGFP PVM - QKTILGIEPSFEAMYMN- - NGVLVGEVI LVYKLNSGKYYSCHMKTL MKSKGW 190 

AF372525 rrGFP PVM-QQDIVGMQPSYESMYTN- - VTSVIGECI IAFKLQTGKHFTYHMRTV- - - YKSKKPV 187 

AF168419 dsRED PVM - QKKTMGW EASTERLY PR- -DGVLKGEIHKALKLKDGGHYLVEFKSI YMAKKP- 186 

AF322222 asCP562 P RDAEQS- 137 

AF168422 asCP562 PVM - KKMTDNWEPSCEKI X PVPKQGILKGDVSMYLLLKDGGRLRCQFDTV YKAKSVP 191

AF246709 asFP595 PVM- QNKAGRWEPATEIVYEV- -DGVLRGQSLMALKCPGGRHLTCHLHTTYRSKKPASA- 186 

AF322221 asFP499 PVM - QKKTCGWEPSTETVIPR- -DGGLLLRDTPALMLADGGHLSCFMETT YKSKKE- 183 

AF384683 mcGFP PVM-QKKTLKWEPSTEKMYVR- -DGVLKGDVNMALLLEGGGHYRCDFRST YKAKKR- 182 

AF401282 mfGFP PVM-QKKTLKWEPSTEKMYVR- -DGVLKGDVNMALLLEGGGHYRCDFKTT YKAKKF- 185 

AF168424 CSFP484 PVM-QKKTLKWEPSTEIMYVR- -DGVLVGDISHSLLLEGGGHYRCDFKSI YKAKKV- 224

AF168420 dsFP483 PW-QKKTTGWEPSTERLYPR- -DGVLIGDIHHALTVEGGGHYACDIKTV- - - YRAKKAA 187 

AY015995 SpGFP PVM- QKAI LGMEPSFEW YMN- - SGVLVGEVDLVYKLESGNYYSCHMKTF YRS KGGV 190 

AF168423 ZSFP538 PVM- KKMTTNWEASCEKIMPVPKQGILKGDVSMYLLLKDGGRYRCQFDTV YKAKSVP 191

AF168421 amFP486 PVM-AKKTTGWDPSFEKMTVC- - DGILKGDVTAFLMLQGGGNYRCQFHTS- - -YKTK-KP 188 

AY013824 amGFPXM NILGHKLEYNFNSHNVYIMPDKANNGLKVNFKIRHNIEGGGVQLADHYQTN- -VPLGDGP 192



P42212 avGFP VLLPDNHYLSTQSALSKDPNEKRDHMVLLEFVTAAGITHGMDELYK 238 

AY015996 rmGFP KEFPSYHFIQHRLEKTYV- EDGG - FVEQHETAIAQMTSIGKPLGSLHEWV 238 

AF372525 rrGFP ETMPLYHFIQHRLVKTNV-DTASGYWQHETAIAAHSTIKKIEGSLP- - - 233

AF168419 dsRED VQLPGYYYVDSKLDITSH-NEDYTIVEQYERTEGRHHLFL 225 

AF322222 asCP562 RKMG ASHRDTL 148 

AF168422 asCP562 RKMPDWHFIQHKLTREDRSDAKNQKWHLTEHAIASGSALP 231 

AF246709 asFP595 LKMPG FHFEDHRI E IMEE - VEKG KCYKQ YEAAVGRYCDAAPS KLGHN 232 

AF322221 asFP499 VKLPELHFHHLRMEKLNI -SDDWKTVEQHESWASYS - QVPS KLGHN 228

AF384683 mcGFP VQLPDYHFVDHRI EILSH- DNDYNTVKLSEDAEARYSMLPSQAK 225 

AF401282 mfGFP VQLPDYHFVDHRI EILSH- DKDYNKVKL YEHAEA- HSGLPRQAK 227 

AF168424 CSFP484 VKLPDYHFVDHRI EI LNH-DKDYNKVTL YEN AVAR YSLLPSQA 266

AF168420 dsFP483 LKMPGYHYVDTKLVIWNN-DKEFMKVEEHEIAVARHHPFYEPKKDK 232 

AY015995 SpGFP KEFPEYHFI HHRLEKTYV- EEGS - FVEQHETAI AQLTTIGKPLGSLHEWV 238

AF168423 ZSFP538 SKMPEWHFIQHKLLREDRSDAKNQKWQLTEHAIAFPSALA 231 

AF168421 amFP486 VTMPPNHWEHRI ARTDLDKGGNS - VQLTEHAVAH ITS WPF 229 

AY013824 amGFPXM VL I P INHYLSTQTAI S KDRNETRDHMVFLEFFS ACGHTHGMDELY K 238 

: : 






Example 2: Construction of GFP complementation fragment probes 

Different fragments of EGFP fused to anti-parallel leucine zippers (called NZ and CZ) that
can bind to each other within prokaryotic and eukaryotic cells were used to evaluate the
optimal site for splitting EGFP for use of such fragments in molecular complementation
experiments, including bimolecular fluorescence complementation experiments. NZ and
CZ leucine zippers were prepared by annealing and ligating phosphorylated
oligonucleotides 2110-2115 (for NZ zipper, see Table 2) or phosphorylated oligonucleotides
2116-2121 (for CZ zipper) into NcoI-BamHI cut pTrcHis-A vector (commercially available
from Invitrogen), producing vector PS1515 (expression vector encoding NZ zipper) or
PS1516 (expression vector encoding CZ zipper). The oligos ligated in NZ and CZ
annealing mixes 1 produced the coding sequences of the N-terminal parts of the NZ and
CZ zippers. The oligos ligated in NZ and CZ annealing mixes 2 produced the coding
sequences of the middle parts of the NZ and CZ zippers, and the oligos ligated in NZ and
CZ annealing mixes 3 produced the coding sequences of the C-terminal parts of the NZ
and CZ zippers.

Annealing primer pairs for NZ zipper

NZ annealing mix 1
Forward oligo 2110 (1 µM)                 5 µl
Reverse oligo 2111 (1 µM)                 5 µl
50 mM Tris-HCl, 10 mM MgCl2, pH 8.0       2 µl
H2O                                       8 µl

NZ annealing mix 2
Forward oligo 2112 (1 µM)                 5 µl
Reverse oligo 2113 (1 µM)                 5 µl
50 mM Tris-HCl, 10 mM MgCl2, pH 8.0       2 µl
H2O                                       8 µl

NZ annealing mix 3
Forward oligo 2114 (1 µM)                 5 µl
Reverse oligo 2115 (1 µM)                 5 µl
50 mM Tris-HCl, 10 mM MgCl2, pH 8.0       2 µl
H2O                                       8 µl




Each of the annealing mixes was heated at 80°C for 2 minutes on a pre-heated Hybaid
OmniGene PCR machine, which was subsequently turned off and allowed to cool to room
temperature (about 10 min). The fragments were subsequently put on ice.
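The composition of one annealing mix can be checked with a short calculation. This is a sketch under two assumptions that the text does not state explicitly: the oligo stocks are read as 1 µM, and "50 mM Tris-HCl, 10 mM MgCl2" is taken as the concentration of the buffer stock that is added. The helper name `final_concentration` is introduced for illustration.

```python
def final_concentration(stock_conc: float, volume_added: float, total_volume: float) -> float:
    """Concentration after dilution, in the same unit as the stock."""
    return stock_conc * volume_added / total_volume


# Volumes in µl as listed for each annealing mix.
volumes_ul = {"forward oligo": 5, "reverse oligo": 5, "buffer": 2, "water": 8}
total_ul = sum(volumes_ul.values())

# Assumed stock concentrations: oligos 1 µM; buffer 50 mM Tris-HCl, 10 mM MgCl2.
oligo_final_uM = final_concentration(1.0, volumes_ul["forward oligo"], total_ul)
tris_final_mM = final_concentration(50.0, volumes_ul["buffer"], total_ul)
mgcl2_final_mM = final_concentration(10.0, volumes_ul["buffer"], total_ul)

print(total_ul, oligo_final_uM, tris_final_mM, mgcl2_final_mM)  # 20 0.25 5.0 1.0
```

Under these assumptions each 20 µl mix contains both oligos at 0.25 µM, i.e. in equimolar amounts, which is what annealing of a complementary pair requires.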

Annealing primer pairs for CZ zipper

CZ annealing mix 1
Forward oligo 2116 (1 µM)                 5 µl
Reverse oligo 2117 (1 µM)                 5 µl
50 mM Tris-HCl, 10 mM MgCl2, pH 8.0       2 µl
H2O                                       8 µl

CZ annealing mix 2
Forward oligo 2118 (1 µM)                 5 µl
Reverse oligo 2119 (1 µM)                 5 µl
50 mM Tris-HCl, 10 mM MgCl2, pH 8.0       2 µl
H2O                                       8 µl

CZ annealing mix 3
Forward oligo 2120 (1 µM)                 5 µl
Reverse oligo 2121 (1 µM)                 5 µl
50 mM Tris-HCl, 10 mM MgCl2, pH 8.0       2 µl
H2O                                       8 µl

Each of the annealing mixes was heated at 80°C for 2 minutes on a pre-heated Hybaid
OmniGene PCR machine, which was subsequently turned off and allowed to cool to room
temperature (about 10 min). The fragments were subsequently put on ice.

Restriction digestion of pTrcHis-A prokaryotic expression vector

The pTrcHis-A prokaryotic expression vector, cut with NcoI and BamHI restriction
enzymes and gel purified, was used for cloning the prepared NZ and CZ leucine zipper
coding sequences:



Restriction digestion of pTrcHis-A vector

pTrcHis-A (1 µg/µl)                                        2 µl
NcoI (10 U/µl)                                             1 µl
NheI (5 U/µl), optional                                    0.5 µl
BamHI (20 U/µl)                                            1 µl
100x BSA                                                   0.4 µl
10x NEB (New England Biolabs, NEB) BamHI buffer            3 µl
H2O                                                        23 µl
Calf intestinal phosphatase (optional, last 20 min only)   0.5 µl

The vector was digested for about 1 hour at 37°C and purified by agarose gel
electrophoresis. The desired vector fragment was recovered from the gel using the
QIAquick Gel Extraction kit (spin columns) from Qiagen and recovered in 50 µl of elution
buffer. NheI, which cuts between NcoI and BamHI, was included to minimise the amounts
of uncut and self-ligating vector.

Ligation and transformation of annealed NZ oligo pairs

Each of the three NZ annealing mixtures 1-3 was diluted 50-fold (1 µl of mixture in 50 µl of
H2O) and mixed and ligated into the cut vector as follows (three hours at 20-24°C):

Ligation of NZ zipper fragments into pTrcHis-A vector

Annealing mix 1                                      1 µl
Annealing mix 2                                      1 µl
Annealing mix 3                                      1 µl
10x T4 DNA ligase buffer (New England Biolabs)       1 µl
T4 DNA ligase (400 U/µl, New England Biolabs)        0.5 µl
pTrcHis-A (NcoI + BamHI cut)                         0.5 µl
H2O                                                  5 µl



Alternatively, the fragments in NZ annealing mixes 1, 2, and 3 can be ligated in the absence
of vector and purified by agarose gel electrophoresis before being ligated into the
NcoI-BamHI cut vector. The annealed and ligated oligonucleotides from annealing mixes 1-3
had single stranded terminal overhangs that were compatible with the overhangs that
were generated by NcoI and BamHI restriction digestion of pTrcHis-A. After ligation of the
fragment into cut pTrcHis-A, the NcoI and BamHI sites were regenerated.

Following ligation into the vector, 2 µl of the ligation mixture was transformed into 50 µl of
One Shot TOP10 chemically competent E. coli cells (Invitrogen) following the
manufacturer's protocol. The ligation can be performed using different amounts or
volumes of fragments and buffers. The inserted DNA sequence and the encoded NZ zipper
peptide are as follows:

MAGGTGSGALKKELQANKKE
CCATGGCCGGTGGTACCGGTTCCGGTGCCCTGAAGAAGGAGCTGCAGGCCAACAAGAAGGAG

LAQLKWELQALKKELAQ*D
CTGGCCCAGCTGAAGTGGGAGCTGCAGGCCCTGAAGAAGGAGCTGGCCCAGTAGGATCC



The Gly-Gly-Thr-Gly-Ser-Gly amino acid sequence in the N-terminus is part of the linker
sequence that was inserted between the NZ zipper peptide and the NEGFP fragment
(N-terminal fragments of EGFP are called NEGFP). The linker sequence in the NZ-NEGFP
fusion protein is also Gly-Gly-Thr-Gly-Ser-Gly, with the Gly-Gly-Thr-Gly coding sequence
being repeated in the NEGFP reverse amplification primers 2129, 2130, and 2131 (Table
3). The sequence contains the unique NcoI (CCATGG), AgeI (ACCGGT) and BamHI (GGATCC)
sites used for cloning of the zipper peptide into pTrcHis-A and the NZ-NEGFP fragments into
the NZ zipper vector PS1515 (see below). The asterisk (*) shows a stop codon.
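
As a consistency check, the NZ insert given above can be translated in silico and scanned for the three cloning sites. The following is a minimal Python sketch and is not part of the described procedure; the names `translate`, `NZ_INSERT` and `SITES` are illustrative assumptions, and the standard genetic code is assumed.

```python
# Minimal sketch (illustrative only): translate the NZ insert and confirm
# that the NcoI, AgeI and BamHI recognition sequences each occur once.

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, with codons enumerated in T, C, A, G order.
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Inserted DNA sequence as listed in the text (both lines joined).
NZ_INSERT = (
    "CCATGGCCGGTGGTACCGGTTCCGGTGCCCTGAAGAAGGAGCTGCAGGCCAACAAGAAGGAG"
    "CTGGCCCAGCTGAAGTGGGAGCTGCAGGCCCTGAAGAAGGAGCTGGCCCAGTAGGATCC"
)

SITES = {"NcoI": "CCATGG", "AgeI": "ACCGGT", "BamHI": "GGATCC"}


def translate(dna: str) -> str:
    """Translate from the first ATG up to (not including) the first stop codon."""
    start = dna.find("ATG")
    if start < 0:
        return ""
    peptide = []
    for pos in range(start, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[pos:pos + 3]]
        if residue == "*":
            break
        peptide.append(residue)
    return "".join(peptide)


nz_peptide = translate(NZ_INSERT)
site_counts = {name: NZ_INSERT.count(seq) for name, seq in SITES.items()}

print(nz_peptide)
print(site_counts)
```

The translated peptide should match the two peptide lines listed above when they are read together, with the stop codon marking the end of the zipper.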

Ligation and transformation of annealed CZ oligo pairs

Each of the three CZ annealing mixtures 1-3 was diluted 50-fold (1 µl of mixture in 50 µl of
H2O) and mixed as follows:

Ligation of CZ zipper fragments into pTrcHis-A vector

CZ annealing mix 1                                   1 µl
CZ annealing mix 2                                   1 µl
CZ annealing mix 3                                   1 µl
10x T4 DNA ligase buffer (New England Biolabs)       1 µl
T4 DNA ligase (400 U/µl, New England Biolabs)        0.5 µl
pTrcHis-A (NcoI + BamHI cut)                         0.5 µl
H2O                                                  5 µl



Alternatively, the fragments in CZ annealing mixes 1, 2, and 3 can be ligated in the absence
of vector and purified by agarose gel electrophoresis before being ligated into the
NcoI-BamHI cut vector. The annealed and ligated oligonucleotides from annealing mixes 1-3
had single stranded terminal overhangs that were compatible with the overhangs that
were generated by NcoI and BamHI restriction digestion of pTrcHis-A. After ligation of the
fragment into cut pTrcHis-A, the NcoI and BamHI sites were regenerated.

Following ligation into the vector, 2 µl of the ligation mixture was transformed into 50 µl
of One Shot TOP10 chemically competent E. coli cells (Invitrogen) following the
manufacturer's protocol. The ligation can be performed using different amounts or
volumes of fragments and buffers. The inserted DNA sequence and the encoded CZ zipper
peptide are as follows:

MAS EQLEKKLQALEKKLAQL
CCATGGCGAGCGAGGAGCTGGAGAAGAAGCTGCAOT

EWKNQALEKKLAQGGTG*
GAGTGGAAGAACCAGGCCCTGGAGAAGAAGCTGGCCCAGGGCGGCACCGGTTAGGATCC

The Gly-Gly-Thr-Gly amino acid sequence in the C-terminus is part of the linker sequence
that was inserted between the CZ zipper peptide and the CEGFP fragment (C-terminal
fragments of EGFP are called CEGFP). The linker sequence in the CZ-CEGFP fusion
protein is also Gly-Gly-Thr-Gly, with the Thr-Gly coding sequence being repeated in the
CEGFP forward amplification primers 2133, 2134, and 2135 (Table 3). The sequence contains
the unique NcoI (CCATGG), AgeI (ACCGGT) and BamHI (GGATCC) sites used for cloning of the
zipper peptide into pTrcHis-A and the CZ-CEGFP fragments into the CZ zipper vector
PS1516 (see below). The asterisk (*) shows a stop codon.

Example 3: E. coli colony PCR screen, plasmid miniprep and DNA
sequencing

The transformed bacteria were plated on Luria Broth (LB) agar plates containing 100
µg/ml of carbenicillin as selection (present in all E. coli media used). To quickly identify
transformants containing the desired NZ or CZ constructs, colony PCR screening was
performed using oligos 2110 (5' forward NZ oligo) and 2115 (3' reverse NZ oligo) or using
oligos 2116 (5' forward CZ oligo) and 2121 (3' reverse CZ oligo):

Per sample (15 µl reaction volume)

10x Taq polymerase buffer (Perkin Elmer)     1.5 µl
dNTP (5 mM nucleotide, each)                 0.3 µl
50 mM MgCl2                                  0.6 µl
Dimethyl sulphoxide (DMSO)                   0.3 µl
Taq polymerase (Perkin Elmer)                0.2 µl
5' forward primer (10 µM)                    0.5 µl
3' reverse primer (10 µM)                    0.5 µl
H2O                                          6.1 µl
Transformant resuspended in H2O              5.0 µl
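
The final concentrations in the 15 µl colony PCR mix follow from the stock concentrations and volumes listed above. The short Python sketch below works through that arithmetic; it is an illustration only, the names `volumes_ul` and `final_conc` are assumptions, and any magnesium already present in the 10x buffer is not accounted for.

```python
# Minimal sketch (illustrative only): total volume and final concentrations
# for the colony PCR mix, using C1 * V1 = C2 * V2.

# Component -> volume in µl, as listed in the table above.
volumes_ul = {
    "10x Taq polymerase buffer": 1.5,
    "dNTP (5 mM each)": 0.3,
    "50 mM MgCl2": 0.6,
    "DMSO": 0.3,
    "Taq polymerase": 0.2,
    "5' forward primer (10 uM)": 0.5,
    "3' reverse primer (10 uM)": 0.5,
    "H2O": 6.1,
    "Transformant resuspended in H2O": 5.0,
}

total_ul = sum(volumes_ul.values())


def final_conc(stock: float, volume_ul: float, total: float) -> float:
    """Final concentration in the same unit as the stock concentration."""
    return stock * volume_ul / total


# Added MgCl2 only; the buffer's own Mg2+ content (if any) is ignored here.
mgcl2_mM = final_conc(50.0, volumes_ul["50 mM MgCl2"], total_ul)
dntp_mM = final_conc(5.0, volumes_ul["dNTP (5 mM each)"], total_ul)
primer_uM = final_conc(10.0, volumes_ul["5' forward primer (10 uM)"], total_ul)
dmso_percent = 100.0 * volumes_ul["DMSO"] / total_ul

print(f"total volume: {total_ul:.1f} ul")
print(f"added MgCl2: {mgcl2_mM:.2f} mM, each dNTP: {dntp_mM:.2f} mM")
print(f"each primer: {primer_uM:.2f} uM, DMSO: {dmso_percent:.1f} % (v/v)")
```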

Cycling parameters (RoboCycler Gradient 96, Stratagene)
Initial denaturation at 94°C for 3 min followed by 25 cycles of (all steps of 1 min):
Denaturation at 94°C, primer annealing at 53°C and primer extension at 72°C.
Finally, an additional extension step at 72°C was included (5 min).

16 NZ transformants and 16 CZ transformants were screened. PCR fragments having the
expected product sizes of about 120 base pairs were amplified from 14 NZ clones and 15
CZ clones, as determined by agarose gel electrophoresis analysis.

Three of the positive colonies were picked from each transformation (NZ and CZ) and
used to inoculate 5 ml of liquid LB medium. After culturing at 37°C overnight, plasmid
DNA was purified by mini preparations using the QIAprep kit from Qiagen.

Plasmids containing correct NZ (PS1515) or CZ (PS1516) fragment inserts were identified
by DNA sequencing on an ABI PRISM model 377 DNA sequencer using forward
sequencing primer 1282.



Example 4: Prokaryotic expression vectors encoding EGFP fragment/zipper
fusion proteins

The DNA sequences encoding the NZ and CZ zippers in the prokaryotic expression
vectors PS1515 and PS1516, respectively, can be fused to DNA sequences encoding
desired EGFP fragments (N-terminal fragments of EGFP are called NEGFP and
C-terminal fragments of EGFP are called CEGFP) or other fragments using the unique AgeI
restriction sites appropriately located in linker sequences in either the 5' end (as in the NZ
vector PS1515) or in the 3' end (as in the CZ vector PS1516) of the leucine zipper coding
sequence in combination with either of the unique NcoI or BamHI sites used for cloning
the zipper coding fragments (DNA and amino acid sequences are shown above). The
general structures of the fusion protein coding sequences are shown in Figure 1.

For example, to prepare a prokaryotic expression vector encoding a fusion protein
consisting of NZ zipper N-terminally fused to an NEGFP fragment, e.g. residues 1-172,
this region of the EGFP coding sequence in the commercial expression vector pEGFP-C1
(Clontech) was amplified by PCR using forward oligo 2128 (containing a unique NcoI site)
and reverse oligo 2131 (containing a unique AgeI site) in accordance with Table 3.

Per sample (50 µl reaction volume)

10x Pfu polymerase buffer (Stratagene)       5.0 µl
dNTP (5 mM nucleotide, each)                 1.0 µl
Pfu Hot Start polymerase (Stratagene)        1.0 µl
5' forward primer (10 µM)                    1.0 µl
3' reverse primer (10 µM)                    1.0 µl
pEGFP-C1 vector (10 ng/µl)                   2.0 µl
H2O                                          39.0 µl



Cycling parameters (Hybaid OmniGene PCR machine)

Initial denaturation at 94°C for 3 min followed by 25 cycles of (all steps of 1 min):
Denaturation at 94°C, primer annealing at 53°C and primer extension at 72°C.
Finally, an additional extension step at 72°C was included (5 min).
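
The cycling program above can be written down as plain data, which makes the programmed hold time easy to total. The Python sketch below is an illustration only; the names `INITIAL`, `CYCLE`, `FINAL` and `total_hold_minutes` are assumptions, and ramp times between temperatures are machine-dependent and not included.

```python
# Minimal sketch (illustrative only): the cycling program as data, and the
# sum of programmed hold times (temperature ramping is not included).

# Each step is (label, temperature in degrees C, hold time in minutes).
INITIAL = [("initial denaturation", 94, 3)]
CYCLE = [
    ("denaturation", 94, 1),
    ("primer annealing", 53, 1),
    ("primer extension", 72, 1),
]
FINAL = [("final extension", 72, 5)]
N_CYCLES = 25


def total_hold_minutes(initial, cycle, final, n_cycles):
    """Sum the hold times of all steps, repeating the cycle n_cycles times."""
    one_cycle = sum(minutes for _, _, minutes in cycle)
    return (
        sum(minutes for _, _, minutes in initial)
        + n_cycles * one_cycle
        + sum(minutes for _, _, minutes in final)
    )


total_minutes = total_hold_minutes(INITIAL, CYCLE, FINAL, N_CYCLES)
print(f"programmed hold time: {total_minutes} min")
```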

The PCR fragment encoding the desired EGFP fragment, e.g. the above mentioned
fragment composed of residues 1-172, with appropriately engineered terminal restriction
sites contained in the primer sequences, was then gel purified as described above, cut with
NcoI and AgeI or AgeI and BamHI, and ligated into the constructed NZ or CZ prokaryotic
leucine zipper expression vectors PS1515 or PS1516 cut with the same enzymes and gel
purified:

Restriction digestion of NEGFP and CEGFP PCR fragments

EGFP fragment (gel purified)                 26 µl
NcoI (10 U/µl) or BamHI (20 U/µl)            0.5 µl
AgeI (10 U/µl)                               1.0 µl
10x New England Biolabs buffer 2             3 µl

Restriction digestion of NZ (PS1515) and CZ (PS1516) vectors

Vector (1 µg/µl)                             1.0 µl
NcoI (10 U/µl) or BamHI (20 U/µl)            0.33 µl
AgeI (10 U/µl)                               0.66 µl
10x New England Biolabs buffer 2             1 µl

All enzymes were from New England Biolabs. The DNA preparations were digested for 1
hour at 37°C and gel purified.

Ligation of EGFP fragments into cut PS1515 or PS1516 vector

Cut and purified vector                              2 µl
Cut and purified NEGFP or CEGFP fragment             4 µl
10x T4 DNA ligase buffer (New England Biolabs)       1 µl
T4 DNA ligase (400 U/µl, New England Biolabs)        0.5 µl
H2O                                                  2.5 µl

Ligation proceeded for 30 min at 22°C, after which 2 µl of each ligation mixture were
transformed into 50 µl of One Shot TOP10 chemically competent E. coli cells (Invitrogen).
The transformed cells were plated on LB plates containing carbenicillin and plasmids were
prepared from two colonies from each transformation as described above.

Example 5: EGFP based bimolecular fluorescence complementation in E.
coli

Plasmids that expressed functional NZ-NEGFP or CZ-CEGFP complementation
constructs were identified by co-transforming 10 µl of One Shot TOP10 chemically
competent E. coli cells (Invitrogen) with 1 µl of each of appropriately matched NZ-NEGFP
and CZ-CEGFP plasmids (i.e., plasmids that express EGFP fragments truncated after
(NEGFP fragments) or before (CEGFP fragments) the same splitting site)
and plating the co-transformed cells on LB plates containing carbenicillin and 5 mM of
isopropyl-β-thiogalactoside (IPTG).

The transformed cells were grown overnight at 37°C. E. coli colonies that were green
fluorescent because of EGFP based bimolecular fluorescence complementation were
visible on the agar plate without magnification about 10-20 hours after transformation (the
fluorescence developed further during storage of the plates at 5°C for one or more days)
when illuminated with a blue light source (Fiberoptic-Heim LQ2600) and viewed through
yellow filter glasses.

Functional complementation was clearly visible in cells co-transformed with
complementation constructs based on splits between either residues 157 and 158 or
residues 172 and 173. The DNA sequences of the expression vectors that
produced functional NZ-NEGFP or CZ-CEGFP complementation fragments (named
PS1594, PS1595, PS1596, PS1597, see Table 4) were verified by DNA sequencing using
primer 1282 as previously described.

Surprisingly, the E. coli colonies of cells co-transformed with the vectors expressing the
EGFP complementation fragments with a split in the Ile171-Ser175 loop (namely between
residues 172 and 173, vectors PS1595 and PS1597) were significantly more fluorescent
than the colonies of cells that were co-transformed with vectors expressing EGFP
complementation fragments that were split in the Ala154-Gly160 loop (namely between
residues 157 and 158, vectors PS1594 and PS1596).

Functional complementation was not clearly visible in cells co-transformed with
complementation constructs based on a split between residues 144 and 145. DNA
sequencing confirmed that expression vectors PS1614 and PS1615 encoded the correct
NZ-NEGFP and CZ-CEGFP complementation fragments, respectively.

Example 6: Eukaryotic expression vectors encoding EGFP fragment/zipper
fusion proteins

Because of the low fluorescence signal produced by the complementation fragments
based on the 144/145 split, only the complementation fragments that were
based on splits at residues 157/158 or 172/173 were transferred to a eukaryotic
expression system to permit evaluation of fragment complementation in mammalian cells.

NZ-NEGFP fragments in PS1596 and PS1597, and CZ-CEGFP fragments in PS1594 and
PS1595, are flanked by an NcoI site 5' to the start codons and a BamHI site 3' to the stop
codons. The fragments were transferred as blunt-ended NcoI/BamHI fragments into
mammalian expression vectors cut with Eco47III/BamHI. To select for stable expression
of both an NZ-NEGFP and a CZ-CEGFP expressing plasmid, the expression vectors for
NZ-NEGFP fragments and CZ-CEGFP fragments contain selection markers for
neomycin/geneticin/G418 and zeocin, respectively.

Plasmids PS1594, PS1595, PS1596, and PS1597 were cut with NcoI restriction enzyme,
blunt-ended with Klenow fragment, gel purified, cut with BamHI and gel purified as
described below.



Restriction digestion of NZ-NEGFP and CZ-CEGFP prokaryotic expression vectors

PS1594, PS1595, PS1596, or PS1597 (1 µg/µl)      1 µl
NcoI (10 U/µl, from New England Biolabs)         1 µl
10x buffer 4 (NEB)                               3 µl
H2O                                              25 µl



The plasmids were digested for about 1 hour at 37°C. 1 µl of 1 mM dNTP mix and 1 unit
of Klenow fragment (New England Biolabs) were added and the reactions were incubated
30 minutes at room temperature. The linear plasmid fragments were purified by agarose
gel electrophoresis and recovered from the gel using the QIAquick Gel Extraction kit (spin
columns) from Qiagen and recovered in 50 µl of elution buffer. 5 µl BamHI buffer (New
England Biolabs) and 10 units BamHI enzyme were added. The plasmids were digested
for about 1 hour at 37°C. The desired plasmid fragments were purified by agarose gel
electrophoresis and recovered from the gel using the QIAquick Gel Extraction kit (spin
columns) from Qiagen and recovered in 50 µl of elution buffer.
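
The blunt-end transfer can be illustrated by modelling what Klenow fill-in does to the NcoI end and how that end joins an Eco47III-cut vector. The Python sketch below is an illustration under stated assumptions: the cut positions used (NcoI C^CATGG leaving a 5' overhang, Eco47III AGC^GCT leaving blunt ends) are general enzyme properties that are not given in this text, and the function names are hypothetical.

```python
# Minimal sketch (illustrative only): junction formed when a filled-in NcoI
# end is ligated to the upstream half of an Eco47III-cut site.
# Assumed cut positions (not stated in the text): NcoI C^CATGG (5' overhang),
# Eco47III AGC^GCT (blunt).

NCOI = ("CCATGG", 1)      # (recognition sequence, top-strand cut index)
ECO47III = ("AGCGCT", 3)


def upstream_end(site: str, cut: int) -> str:
    """Top-strand bases remaining on the upstream fragment after cutting."""
    return site[:cut]


def filled_in_downstream_end(site: str, cut: int) -> str:
    """Top-strand bases at the start of the downstream fragment.

    For a 5' overhang, polymerase fill-in copies the overhang, so every base
    to the right of the top-strand cut ends up double stranded (blunt).
    """
    return site[cut:]


vector_end = upstream_end(*ECO47III)
insert_start = filled_in_downstream_end(*NCOI)
junction = vector_end + insert_start
start_codon_intact = "ATG" in insert_start

print(junction, start_codon_intact)
```

Under these assumptions the ATG start codon within the NcoI site is retained at the junction, which is consistent with transferring the fragments as blunt-ended NcoI/BamHI fragments.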



To stably co-express NZ-NEGFP and CZ-CEGFP fragments in the same mammalian cell,
mammalian expression vectors carrying different selection markers were required. To
obtain this, the kanamycin/neomycin selection marker on the expression vector pEGFP-C1
was replaced with a zeocin resistance marker, resulting in the plasmid referred to as
PS0609.

Replacement of kanamycin/neomycin marker on pEGFP-C1 with zeocin marker

pEGFP-C1 was digested with AvrII, which excises the kanamycin/neomycin selection
marker, and following gel purification, the vector fragment was ligated with an
approximately 0.5 kbp AvrII fragment encoding zeocin resistance. This fragment was
isolated by PCR amplification of the zeocin selection marker on plasmid pZeoSV
(Invitrogen) using primers 9655 and 9658 (see Table 2). Both primers contain AvrII
cloning sites and flank the zeocin resistance gene on plasmid pZeoSV including its E. coli
promoter. The top primer 9658 spans the AseI site at the beginning of zeocin, which can
be used to determine the orientation of the AvrII insert relative to the SV40 promoter
which drives resistance in mammalian cells. The resulting plasmid is referred to as
PS0609.

Plasmids pEGFP-C1 (Clontech) and its zeocin-resistant derivative PS0609 were cut with
Eco47III restriction enzyme, gel purified, cut with BamHI and gel purified as described
below. These steps excise EGFP and leave the rest of the vectors intact.

Restriction digestion of eukaryotic expression vectors

pEGFP-C1 or PS0609 DNA (1 µg/µl)             0.5 µl
Eco47III (10 U/µl, from Promega)             1 µl
10x buffer D (Promega)                       3 µl
H2O                                          25.5 µl

The plasmids were digested for about 1 hour at 37°C. The linear plasmid fragments were
purified by agarose gel electrophoresis and recovered from the gel using the QIAquick
Gel Extraction kit (spin columns) from Qiagen and recovered in 50 µl of elution buffer. 5 µl
BamHI buffer (New England Biolabs) and 10 units BamHI enzyme were added. The
plasmids were digested for about 1 hour at 37°C. The desired vector fragments were
purified by agarose gel electrophoresis and recovered from the gel using the QIAquick
Gel Extraction kit (spin columns) from Qiagen and recovered in 50 µl of elution buffer.

Ligation of NZ-NEGFP fragments into pEGFP-C1 and CZ-CEGFP fragments into PS0609

Cut and purified vector fragment                         1 µl
Cut and purified NZ-NEGFP or CZ-CEGFP fragment           3 µl
10x T4 DNA ligase buffer (New England Biolabs)           1 µl
T4 DNA ligase (400 U/µl, New England Biolabs)            0.5 µl
H2O                                                      5 µl



Ligation reactions were incubated at 16°C overnight. 3 µl were transformed into One Shot
TOP10 chemically competent E. coli cells (Invitrogen) and transformants were selected on
imMedia with kanamycin or imMedia with zeocin (both from Invitrogen) for pEGFP-C1 and
PS0609 derivatives, respectively.



Four transformants from each transformation plate were picked in imMedia medium with
appropriate selection (kanamycin or zeocin) and grown at 37°C for 6 hours.
Plasmid DNA was isolated by the QIAprep spin column method (Qiagen) and analysed by
restriction digests with AseI and MluI. The DNA sequences of the inserts were finally
verified by sequencing as described above. The resulting plasmids were named PS1557,
PS1558, PS1559, and PS1560 (Table 4).



Example 7: EGFP based bimolecular fluorescence complementation in
mammalian cells

To establish cell lines that express EGFP fragment/zipper fusion proteins, CHO-hIR cells
were transfected with plasmid pairs, resulting in two cell lines: 1) CHO-hIR
PS1559+PS1557, and 2) CHO-hIR PS1560+PS1558. The CHO-hIR cell line consists of
CHO-K1 (ATCC CCL-61) cells that have been stably transfected with the human insulin
receptor (hIR, GenBank Acc# M10051) as described in: Hansen, B. F., Danielsen, G. M.,
Drejer, K., Sorensen, A. R., Wiberg, F. C., Klein, H. H., Lundemose, A. G. (1996)
Sustained signalling from the insulin receptor after stimulation with insulin analogues
exhibiting increased mitogenic potency. Biochem. J. Apr 1; 315 (Pt 1):271-279. The
selection marker for the vector is methotrexate (MTX). The hIR expression is very stable
in the CHO-hIR cells because of the insulin sensitivity of the cell line, and a very stable
phenotype can be maintained without selection pressure. Stable cells were obtained by
cell growth in selection medium containing Geneticin and Zeocin.

CHO-hIR cells were transfected using Fugene (Roche) according to the manufacturer's
instructions. The day after transfection, cells were examined for transient expression, split
1:10 and exposed to selection medium (growth medium supplemented with 500 µg/ml
geneticin (Invitrogen) and 1 mg/ml zeocin (Cayla)). The cell lines were stable after 2-3
weeks of culture in selection medium.

The growth medium used was NUT.MIX F-12 (Ham's) with GLUTAMAX-1
(Gibco/Invitrogen) supplemented with 10% fetal bovine serum (JRH Biosciences) and 1%
Penicillin-Streptomycin (10,000 IU/ml, Gibco/Invitrogen). The CHO-hIR cells were cultured
in growth medium, and split 1:4 to 1:16 twice a week according to standard cell culture
protocols. The CHO-hIR PS1559+PS1557 and CHO-hIR PS1560+PS1558 cells were treated
likewise, except that the growth medium was supplemented with 500 µg/ml geneticin
(Invitrogen) and 1 mg/ml zeocin (Cayla) at all times.

Images of three CHO cell lines separately transfected with pEGFP-C1 (expressing EGFP
with a short C-terminal extension), PS1559 + PS1557 (expressing EGFP
complementation fragments split at 157-158, NG157 + CG158) and with PS1560 +
PS1558 (expressing EGFP complementation fragments split at 172-173, NG172 +
CG173) were collected 1 day, 2 days and 10 days after transfection to assess the relative
brightness of cells expressing the different complementation constructs. Images were
collected on a Nikon Diaphot 300 equipped for epifluorescence work. The light source for
epifluorescence was a Nikon 100W Hg arc lamp, coupled to the microscope through a
custom quartz fibre illuminator (TILL Photonics GmbH, Planegg, Germany). Excitation
light passed through a 450-490 nm bandpass filter (Delta Light and Optics, Lyngby,
Denmark) and was directed to the specimen via a Chroma 72100 505 nm cut-on dichroic
mirror (Chroma Technology, Brattleboro, VT, USA). A x40 NA1.3 oil immersion lens was
used for all images. Emitted light passed through a 540-550 nm bandpass filter (Chroma) to a
Hamamatsu Orca ER camera. All images were collected with 50 millisecond exposure
time, chosen to ensure non-saturation of images for even the brightest
(EGFP-expressing) cells in each optical field (maximum pixel count <4095). Imaging software
used to acquire images on this system was IPLab for Windows (Scanalytics, USA).

Presentation and analysis of images

The microscope images were analysed using the ImageJ software package, the public
domain image analysis software written by Wayne Rasband of the US National Institutes of
Health (http://rsb.info.nih.gov/ij/), and the data analysis was performed in Microsoft Excel.
The images shown in Figure 2 are of fluorescent CHO-hIR cells co-transfected with
different NZ-NEGFP (NG) and CZ-CEGFP (CG) expression vectors or transfected with
pEGFP-C1. The images are scaled individually to visualise the cells and the fluorescence
distribution within them. Because of this scaling, the relative fluorescence levels cannot be
compared between the images. When the same images are scaled identically they appear
as in Figure 3, and it is apparent that the cells that are transfected with complementation
constructs that are based on a split between residues 172 and 173 are significantly more
fluorescent than the cells that are transfected with complementation constructs that are
based on a split between residues 157 and 158. However, the cells transfected with the
pEGFP-C1 construct show significantly stronger fluorescence on day 2.

The same images were analysed for background and maximum fluorescence intensities
using the ImageJ software package (Figure 4). From the figure, it is clear that a split
between residues 172 and 173, and probably anywhere else in the Ile171-Ser175 loop, is
greatly superior to a split between residues 157 and 158 and probably also to splits
anywhere else in the Ala154-Gly160 loop.
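
The background and maximum intensities referred to above are defined in the Figure 4 legend as the average intensities of the darkest and brightest 0.5% of the pixels in each image. The original analysis was carried out in ImageJ and Microsoft Excel; the Python sketch below only illustrates the same calculation on a flat list of pixel values, and the function name `extreme_pixel_means` and the synthetic test image are assumptions.

```python
# Minimal sketch (illustrative only): mean intensity of the darkest and
# brightest 0.5% of pixels, as described in the Figure 4 legend.

def extreme_pixel_means(pixels, fraction=0.005):
    """Return (background, maximum) for a flat sequence of pixel intensities.

    background: mean of the darkest `fraction` of the pixels
    maximum:    mean of the brightest `fraction` of the pixels
    """
    ordered = sorted(pixels)
    # At least one pixel is always used, even for very small images.
    k = max(1, round(len(ordered) * fraction))
    background = sum(ordered[:k]) / k
    maximum = sum(ordered[-k:]) / k
    return background, maximum


# Synthetic example: 1000 pixels with intensities 0..999, so 0.5% is 5 pixels.
pixels = list(range(1000))
background, maximum = extreme_pixel_means(pixels)
difference = maximum - background

print(background, maximum, difference)
```

Averaging a small group of extreme pixels rather than taking the single darkest or brightest pixel makes both values less sensitive to isolated noisy pixels.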

Example 8: Eukaryotic expression vectors encoding EYFP and EYFP variant
F64L fragment/zipper fusion proteins

In the following examples, this nomenclature will be used:

Y     Fragment of EYFP

G     Fragment of EGFP

N157  N-terminal fluorescent protein fragment C-terminally truncated by a split in or
      close to the loop between beta sheets 7 and 8 (residues 154-160), e.g. between
      residues 154 and 155 or between residues 157 and 158 in EGFP or EYFP. The
      fragment is fused to a leucine zipper sequence as described in Example 4.

N172  N-terminal fluorescent protein fragment C-terminally truncated by a split in or
      close to the loop between beta sheets 8 and 9 (residues 171-175), e.g. between
      residues 172 and 173 in EGFP or EYFP. The fragment is fused to a leucine
      zipper sequence as described in Example 4.

C158  C-terminal fluorescent protein fragment N-terminally truncated by a split in or
      close to the loop between beta sheets 7 and 8 (residues 154-160), e.g. between
      residues 154 and 155 or between residues 157 and 158. The fragment is fused to
      a leucine zipper sequence as described in Example 4.

C173  C-terminal fluorescent protein fragment N-terminally truncated by a split in or
      close to the loop between beta sheets 8 and 9 (residues 171-175), e.g. between
      residues 172 and 173 in EGFP or EYFP. The fragment is fused to a leucine
      zipper sequence as described in Example 4.

F64L  Fluorescent protein variant containing a leucine residue in position 64, e.g. in
      place of a phenylalanine residue. This residue is in position 65 in EGFP and
      EYFP because of an extra residue (Val2) in position 2.

Mutagenesis of the eukaryotic NG expression vectors PS1559 (NG157) and PS1560
(NG172) into the corresponding N-terminal EYFP (SEQ ID NO: 5) fragment (NY) variants
and mutagenesis of the eukaryotic CG expression vectors PS1557 (CG158) and PS1558
(CG173) into the corresponding C-terminal EYFP fragment (CY) variants was
accomplished by site directed mutagenesis using the QuikChange kit and by following
the manufacturer's instructions (Stratagene). Primers 2333 and 2334 were used to convert
expression vectors PS1559 (NG157) and PS1560 (NG172) into N-terminal EYFP
fragment expression vectors PS1639 (NY157) and PS1642 (NY172). The introduced
mutations were: L64F:T65G:V68L:S72A. Furthermore, primers 2335 and 2336 were used
to convert expression vectors PS1559 (NG157) and PS1560 (NG172) into F64L mutated
N-terminal EYFP fragment expression vectors PS1640 (NY157 F64L) and PS1641
(NY172 F64L). The introduced mutations were: T65G:V68L:S72A. Accordingly, the
expressed YN fragments have the following amino acid sequences (only residues 64-72
are shown):

Template                      L64 T65 Y66 G67 V68 Q69 C70 F71 S72
YN (L64F:T65G:V68L:S72A)      F64 G65 Y66 G67 L68 Q69 C70 F71 A72
YN F64L (T65G:V68L:S72A)      L64 G65 Y66 G67 L68 Q69 C70 F71 A72

Finally, primers 2337 and 2338 were used to convert expression vectors PS1557 (CG158)
and PS1558 (CG173) into C-terminal EYFP fragment expression vectors PS1637
(CY158) and PS1638 (CY173) by introducing a T203Y mutation. All sequences were
verified by DNA sequencing of the vectors and all primer sequences are shown in Table
2.
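
The relationship between the template and the two mutated N-terminal fragments can be reproduced by applying the listed point mutations to residues 64-72 of the template. The Python sketch below is an illustration only; the function name `apply_mutations` and the constant names are assumptions, and the template window is the one shown in the text for residues 64-72.

```python
# Minimal sketch (illustrative only): apply point mutations written as
# <old residue><position><new residue> to a numbered residue window.
import re

TEMPLATE_64_72 = "LTYGVQCFS"  # residues 64-72 of the template, as listed above
FIRST_POSITION = 64


def apply_mutations(window, first_position, mutations):
    """Return the window with each mutation applied, checking the old residue."""
    residues = list(window)
    for mutation in mutations:
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
        if match is None:
            raise ValueError(f"cannot parse mutation {mutation!r}")
        old, position, new = match.group(1), int(match.group(2)), match.group(3)
        index = position - first_position
        if not 0 <= index < len(residues):
            raise ValueError(f"position {position} is outside the window")
        if residues[index] != old:
            raise ValueError(f"expected {old} at {position}, found {residues[index]}")
        residues[index] = new
    return "".join(residues)


yn = apply_mutations(TEMPLATE_64_72, FIRST_POSITION, ["L64F", "T65G", "V68L", "S72A"])
yn_f64l = apply_mutations(TEMPLATE_64_72, FIRST_POSITION, ["T65G", "V68L", "S72A"])

print(yn)
print(yn_f64l)
```

The only difference between the two results is residue 64, which is phenylalanine in the first variant and remains leucine in the F64L variant.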

Example 9: EGFP based bimolecular fluorescence complementation in
mammalian cells

The constructed EYFP based split fluorescent protein expression vectors PS1637 to
PS1642 described above were investigated in mammalian cells in parallel with the EGFP
based split fluorescent protein expression vectors PS1557 to PS1560 described in
Example 7 and using the same experimental set-up (including the same filter set) and
procedures (including the image analysis procedure), except that all images were
produced using 10 ms exposure times instead of 50 ms exposure times, because of the
increased brightness of the probes, and a 20x objective was used instead of a 40x
objective to image more cells. Other appropriate filter sets could have been used. The
images were taken the day after transfection (day 1).

It is apparent from the identically scaled fluorescence images of the transfected cells
(Figure 6) that the split site between residues 172 and 173 is again shown to be superior
to the split site between residues 157 and 158. Furthermore, it is apparent that
complementation based on EYFP fragments is superior to complementation based on
EGFP fragments. Surprisingly, introduction of the F64L mutation from EGFP into the
N-terminal EYFP fragments further greatly enhanced the fluorescence of the complementing
fragments. As can be seen from the images, the positive effects of using the optimal
splitting site (between residues 172 and 173), using the optimal fluorescent protein colour
variant (EYFP) and introducing the F64L folding mutation into the NY fragment are
additive. Quantification of these observations was done by analysing the images shown in
Figure 6, and the numeric output is presented in Figure 7.



Effects of colour (yellow better):

Good                             Better
EGFP                     vs      EYFP
GN157 + GC158            vs      YN157 + YC158
GN172 + GC173            vs      YN172 + YC173

Effects of split site (172/173 better):

Good                             Better
GN157 + GC158            vs      GN172 + GC173
YN157 + YC158            vs      YN172 + YC173
GN157 F64L + GC158       vs      YN172 F64L + YC173

Effects of F64L (+F64L better):

Good                             Better
YN157 + YC158            vs      YN157 F64L + YC158
YN172 + YC173            vs      YN172 F64L + YC173



It is interesting to note that the optimal construct pair (YN172 F64L + YC173), when
re-assembled, is nearly as intense as EYFP itself. The great increase in fluorescence
intensity is important in many types of quantitative cell analyses (e.g. high throughput
screening and microscopy) to increase the signal to noise ratios, to facilitate detection of
low amounts of probes in vivo or in vitro, etc.

Mixing YN and GC or GN and YC fragments can also produce functional fluorescent
complexes, potentially of different colours (Figs. 8 and 9). Fragments having overlapping
sequences are also functional and may be very attractive in e.g. functional cloning
systems where highly flexible linker sequences are required due to the very diverse
nature of the fusion partners. The overlapping fragments permit either of the fusion
partners to have a long linker sequence (Figure 8, quantified in Figure 9).

Figure legends

Figure 1

General structures of the fusion protein coding sequences.

Figure 2

16-bit images of fluorescent CHO-hIR cells co-transfected with NZ-NEGFP and
CZ-CEGFP expression vectors or transfected with pEGFP-C1 were taken and scaled
individually to visualise the cells and the fluorescence distribution within them. Because of
the pixel intensity scaling, the relative fluorescence levels cannot be compared among the
images. The splitting sites are either at residues 157/158 (top row, plasmids PS1557 and
PS1559) or at residues 172/173 (middle row, plasmids PS1558 and PS1560). The EGFP
expression vector pEGFP-C1 was transfected into the cells in the bottom row. The images
were taken 1 day (left column), 2 days (middle column), or 10 days (right column) after
transfection. The images of the cells are representative of the cells that expressed
functionally complementing fragments.

Figure 3

The same 16-bit images of fluorescent CHO-hIR cells co-transfected with NZ-NEGFP and
CZ-CEGFP expression vectors or transfected with pEGFP-C1 as shown in Figure 2, but
the images are now shown with the same intensity scaling to allow comparison of
fluorescence intensities. The cells transfected with complementation constructs
based on a split between residues 172 and 173 (middle row) are clearly more
fluorescent than the cells transfected with complementation constructs
based on a split between residues 157 and 158 (top row). However, the cells transfected
with the pEGFP-C1 construct (bottom row) show significantly stronger fluorescence at day
2.

Figure 4

The unmanipulated microscope images shown in Figure 3 were analysed using the
ImageJ software package, and data analysis was performed in Microsoft Excel. For each
16-bit monochrome IP Lab microscope image, pixel intensity data were produced in
ImageJ and exported to an Excel spreadsheet for data analysis. The darkest and
brightest 0.5% of the pixels were identified in each image and the average intensities of
these two groups of pixels were calculated. The average intensity of the 0.5% darkest
pixels was defined as the background fluorescence intensity (shown as white bars in the
histogram) and the average intensity of the 0.5% brightest pixels was defined as the maximum
intensity. The difference between the maximum intensity and the background
intensity was defined as the response (shown as cross-hatched bars in the histogram).
The sum of the background intensity and the response is equal to the maximum intensity.
From the figure, it is clear that EGFP based fluorescence complementation using a split
between residues 172 and 173, and probably anywhere else in this loop, is greatly
superior to EGFP based fluorescence complementation using a split between residues
157 and 158 and probably also to splits anywhere else in this loop.
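
The quantification described in this legend can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the ImageJ/Excel workflow that was actually used; the function name quantify_response, the tail_per_mille parameter and the rule of keeping at least one pixel per group are assumptions made for the example.

```python
from statistics import mean


def quantify_response(pixels, tail_per_mille=5):
    """Return (background, maximum, response) for one image.

    pixels: flat sequence of pixel intensities (e.g. 16-bit values).
    tail_per_mille: size of each pixel group in thousandths of the image;
        5 corresponds to the 0.5% used in the legend.
    """
    ordered = sorted(pixels)
    if not ordered:
        raise ValueError("image has no pixels")
    # Assumption: round the group size down, but keep at least one pixel.
    n_tail = max(1, len(ordered) * tail_per_mille // 1000)
    background = mean(ordered[:n_tail])   # average of the darkest pixels
    maximum = mean(ordered[-n_tail:])     # average of the brightest pixels
    response = maximum - background       # background + response == maximum
    return background, maximum, response
```

For a real image the pixel values would first be flattened into one list; the three returned numbers correspond to the white bar, the total bar height and the cross-hatched bar of the histogram.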

Figure 5

Positions of appropriate fluorescent protein splitting sites are shown on ribbon and
wire-frame representations of GFP. The two representations show the same sites from two
sides (molecule rotated approximately 180 degrees around a vertical axis).

Figure 6

Co-transfection of expression vectors expressing EGFP and EYFP based
complementation fragments as described in Figure 3 to compare the abilities of the
various complementation fragments to combine in cells and produce functional
complexes. All images are scaled identically to allow direct comparison of fluorescence
intensities between the images.

Single transfections with N-terminal fragments only resulted in no detectable fluorescence
above the background level (data not shown). These N-terminal fragments contain the
chromophore.

Figure 7

Quantitative analysis of the images shown in Figure 6. The results are in accord with the
impressions from visual inspection of the cells. The data were produced as described in
the legend to Figure 4.

Figure 8

Co-transfection of expression vectors expressing EGFP and EYFP based
complementation fragments as described in Figure 3 to compare the effects of mixing
differently coloured EGFP, EYFP and EYFP F64L fragments and to determine the
influence of overlapping fragments, e.g. combining fragments encoding residues 1-172
and 158-238. All colour combinations complement, but typically less efficiently than the
correct combinations. Fragments having overlapping regions are also functional and this
may be advantageous in experiments where longer linker sequences are or may be required
by the fusion partners due to steric hindrance. This was not the case in these experiments,
where the fusion partners are leucine zippers. In the example (middle column), residues
158-172 were present in both fragments. In all situations, the F64L mutation has a favourable
effect on the fluorescence intensities. All images are scaled identically to allow direct
comparison of fluorescence intensities between the images.
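
The relationship between a split position and the resulting fragments, including the overlapping case mentioned in this legend (residues 1-172 combined with 158-238), can be illustrated with a small Python sketch. The function split_fragments and the placeholder string are hypothetical and serve only to show the residue arithmetic; the placeholder is not a real GFP sequence.

```python
def split_fragments(sequence, x, y=None):
    """Split a protein sequence into N- and C-terminal fragments.

    Residues are numbered from 1. The N-terminal fragment covers 1..x and
    the C-terminal fragment covers y..end. With y omitted, y = x + 1 gives
    two non-overlapping fragments; y <= x gives an overlap of residues y..x.
    """
    if y is None:
        y = x + 1
    if not 1 <= x < len(sequence):
        raise ValueError("x must lie inside the sequence")
    if not 2 <= y <= x + 1:
        raise ValueError("y must satisfy 2 <= y <= x + 1")
    n_fragment = sequence[:x]       # residues 1..x
    c_fragment = sequence[y - 1:]   # residues y..end
    overlap = sequence[y - 1:x]     # residues y..x, empty when y == x + 1
    return n_fragment, c_fragment, overlap


# Placeholder 238-residue string (an assumption for the example only).
placeholder = "".join("ACDEFGHIKLMNPQRSTVWY"[i % 20] for i in range(238))
n_frag, c_frag, shared = split_fragments(placeholder, x=172, y=158)
print(len(n_frag), len(c_frag), len(shared))  # prints: 172 81 15
```

The 15 shared residues correspond to positions 158-172, the stretch stated above to be present in both fragments.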

Figure 9

Quantitative analysis of the images shown in Figure 8. The results can be compared
directly with the results shown in Figure 7 and they are in accord with the impressions
from visual inspection of the cells. The data were produced as described in the legend to
Figure 4.

Tables



Table 2 Oligonucleotides used in cloning. Oligonucleotides beginning with P* are
phosphorylated at the 5' end to permit ligation.

Oligonucleotide   Oligonucleotide sequence (5' end to 3' end)

1282              CAGACAATCTGTGTGGGCACTCGACCGfi
2110              P*C ATG G CCGGTGGTACCG GTTP PGfnTfSPPP Tf5 A A<^ A Af2 f5 Af*P"TY5P AG d
2111              P*AG PTPPTTnTTn AG Rfi PAPPrtrt a ap pf5 r^T a p p Ap rsrcp
2112              P^CAACAAGAAGGAGPTGGPrPAGPTGAAf^TniririA^rTrsPAG
2113              P*CTCCCACTTCAGCTGGGnnAGnTPPTTPTTGTTGGrrTGP
2114              P*GCCCTGAAGAAGGAGPTGGP PP AGTAG
2115              P*GATCCTACTGGGCCAGCTPPTTPTTPAGGGPPTGPAG
2116              P*CATGGCCAGCGAGPAGPTGGAGAAGAAGPTGPAGGPPPTG
2117              P*CCTG C AG CTTCTTCTCC AGCTG PTPG PTGGP
2118              P*GAGAAGAAGCTGGCPPAGPTGGAGTGGAAGAAPPAGGPPPTGGAG
2119              P*G G CCTG GTTCTTCC ACTCCAGCTGGGCC AGCTTPTTPTPC AGGG
2120              P*AAGAAGCTGGCCCAGGGCGGCACCGGTTAG
                  P GATCCTAACCGGTGCCGCCCTGGGCCAGCTTCn
2128              GGCGCCATGGTGAGCAAGGGCGAG
2129              GCCGGACCGGTACCACCGTTGTACTCCAGCTTGTG
2130              GCCGGACCGGTACCACCCTGCTTGTCGGCCATG
2131              GPPGGACCGGTAPPAPPPTPGATrVTTfVn'SfSPttGATP
2132              CCCCGGATCCTACTTGTACAGCTCGTCCATGC
2133              GGCGCCATGGGCACCGGTTACAACAGCCACAACGTC
2134              GGCGCCATGGGCACCGGTAAGAACGGCATCAAGGTG
2135              GGCGCCATGGGCACCGGTGACGGCAGCGTGCAGCTC
2333              GCCCACCCTCGTGACCACCTTCGGCTACGGCCTGCAGTGCTTCGCCCGCTACCCCGACCACATG
                  CATGTGGTCGGGGTAGCGGGCGAAGCACTGCAGGCCGTAGCCGAAGGTGGTCACGAGGGTGGGC
2335              GCCCACCCTCGTGACCACCCTGGGCTACGGCCTGCAGTGCTTCGCCCGCTACCCCGACCACATG
2336              CATGTGGTCGGGGTAGCGGGCGAAGCACTGCAGGCCGTAGCCCAGGGTGGTCACGAGGGTGGGC
2337              GACAACCACTACCTGAGCTACCAGTCCGCCCTGAGC
2338              GCTCAGGGCGGACTGGTAGCTCAGGTAGTGGTTGTC
9655              TCCTAGGTCAGTCCTGCTCCTCGGCCACGAAGTGCAC
                  TCCTAGGCTGCAGCACGTGTTGACAATTAATCATCGG
9658              CAGACAATCTGTGTGGGCACTCGACCGG
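
Some of the legible entries in Table 2 can be checked against each other. As an illustrative sketch (the helper reverse_complement and the constant names are assumptions made for the example, not part of the cloning procedure), the following Python confirms that oligonucleotides 2337 and 2338 as listed are exact reverse complements of one another:

```python
# Translation table mapping each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


# Sequences copied from Table 2 (5' end to 3' end).
OLIGO_2337 = "GACAACCACTACCTGAGCTACCAGTCCGCCCTGAGC"
OLIGO_2338 = "GCTCAGGGCGGACTGGTAGCTCAGGTAGTGGTTGTC"

print(reverse_complement(OLIGO_2337) == OLIGO_2338)  # prints: True
```

The same check can be applied to other pairs in the table; it is only meaningful for entries whose sequence is fully legible.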
Table 3 Primer pairs used in EGFP fragment amplification



Protein encoded by PCR fragment   5' primer   3' primer

EGFP(1-144)                       2128        2129
EGFP(1-157)                       2128        2130
EGFP(1-172)                       2128        2131
EGFP(145-238)                     2133        2132
EGFP(158-238)                     2134        2132
EGFP(173-238)                     2135        2132

Table 4 Cloning and expression vectors

Vector      Expressed protein     Promoter   Selection (E. coli/mamm.)

pEGFP-C1    EGFP                  CMV        kan/neo
PS0609      EGFP                  CMV        zeo/zeo
pTrcHis-A   no insert             Trc        amp/none
PS1515      NZ leucine zipper     Trc        amp/none
PS1516      CZ leucine zipper     Trc        amp/none
PS1614      NZ-NEGFP(1-144)       Trc        amp/none
PS1596      NZ-NEGFP(1-157)       Trc        amp/none
PS1597      NZ-NEGFP(1-172)       Trc        amp/none
PS1615      CZ-CEGFP(145-238)     Trc        amp/none
PS1594      CZ-CEGFP(158-238)     Trc        amp/none
PS1595      CZ-CEGFP(173-238)     Trc        amp/none
PS1559      NZ-NEGFP(1-157)       CMV        kan/neo
PS1560      NZ-NEGFP(1-172)       CMV        kan/neo
PS1557      CZ-CEGFP(158-238)     CMV        zeo/zeo
PS1558      CZ-CEGFP(173-238)     CMV        zeo/zeo

Sequences






SEQ ID 1 Amino acid sequence of GFP

MSKGEELFTGVVPILVELDGDVNGHKFSVSGEGEGDATYGKLTLKFICTT
GKLPVPWPTLVTTFSYGVQCFSRYPDHMKQHDFFKSAMPEGYVQERTIFF
KDDGNYKTRAEVKFEGDTLVNRIELKGIDFKEDGNILGHKLEYNYNSHNV
YIMADKQKNGIKVNFKIRHNIEDGSVQLADHYQQNTPIGDGPVLLPDNHY
LSTQSALSKDPNEKRDHMVLLEFVTAAGITHGMDELYK

SEQ ID 2 Amino acid sequence of GFP Y66W

MSKGEELFTGVVPILVELDGDVNGHKFSVSGEGEGDATYGKLTLKFICTT
GKLPVPWPTLVTTFSWGVQCFSRYPDHMKQHDFFKSAMPEGYVQERTIFF
KDDGNYKTRAEVKFEGDTLVNRIELKGIDFKEDGNILGHKLEYNYNSHNV
YIMADKQKNGIKVNFKIRHNIEDGSVQLADHYQQNTPIGDGPVLLPDNHY
LSTQSALSKDPNEKRDHMVLLEFVTAAGITHGMDELYK

SEQ ID 3 Amino acid sequence of GFP Y66H

MSKGEELFTGVVPILVELDGDVNGHKFSVSGEGEGDATYGKLTLKFICTT
GKLPVPWPTLVTTFSHGVQCFSRYPDHMKQHDFFKSAMPEGYVQERTIFF
KDDGNYKTRAEVKFEGDTLVNRIELKGIDFKEDGNILGHKLEYNYNSHNV
YIMADKQKNGIKVNFKIRHNIEDGSVQLADHYQQNTPIGDGPVLLPDNHY
LSTQSALSKDPNEKRDHMVLLEFVTAAGITHGMDELYK

SEQ ID 4 Amino acid sequence of EGFP

MVSKGEELFTGVVPILVELDGDVNGHKFSVSGEGEGDATYGKLTLKFICT
TGKLPVPWPTLVTTLTYGVQCFSRYPDHMKQHDFFKSAMPEGYVQERTIF
FKDDGNYKTRAEVKFEGDTLVNRIELKGIDFKEDGNILGHKLEYNYNSHN
VYIMADKQKNGIKVNFKIRHNIEDGSVQLADHYQQNTPIGDGPVLLPDNH
YLSTQSALSKDPNEKRDHMVLLEFVTAAGITLGMDELYK

SEQ ID 5 Amino acid sequence of EYFP

MVSKGEELFTGVVPILVELDGDVNGHKFSVSGEGEGDATYGKLTLKFICT
TGKLPVPWPTLVTTFGYGLQCFARYPDHMKQHDFFKSAMPEGYVQERTIF
FKDDGNYKTRAEVKFEGDTLVNRIELKGIDFKEDGNILGHKLEYNYNSHN
VYIMADKQKNGIKVNFKIRHNIEDGSVQLADHYQQNTPIGDGPVLLPDNH
YLSYQSALSKDPNEKRDHMVLLEFVTAAGITLGMDELYK

SEQ ID 6 Amino acid sequence of EYFP F64L variant

MVSKGEELFTGVVPILVELDGDVNGHKFSVSGEGEGDATYGKLTLKFICT
TGKLPVPWPTLVTTLGYGLQCFARYPDHMKQHDFFKSAMPEGYVQERTIF
FKDDGNYKTRAEVKFEGDTLVNRIELKGIDFKEDGNILGHKLEYNYNSHN
VYIMADKQKNGIKVNFKIRHNIEDGSVQLADHYQQNTPIGDGPVLLPDNH
YLSYQSALSKDPNEKRDHMVLLEFVTAAGITLGMDELYK

Claims






1. Two GFP fragments comprising

(a) an N-terminal fragment of GFP, comprising a continuous stretch of amino acids from
amino acid number 1 to amino acid number X of GFP, wherein the peptide bond between
amino acid number X and amino acid number X+1 is within a loop of GFP and

(b) a C-terminal fragment of GFP, comprising a continuous stretch of amino acids from
amino acid number X+1 to amino acid number 238 of GFP.

2. Two GFP fragments comprising

(a) an N-terminal fragment of GFP, comprising a continuous stretch of amino acids from
amino acid number 1 to amino acid number X of GFP, wherein the peptide bond between
amino acid number X and amino acid number X+1 is within a loop of GFP and

(b) a C-terminal fragment of GFP, comprising a continuous stretch of amino acids from
amino acid number Y to amino acid number 238 of GFP, wherein Y<X creating an overlap
of the two GFP fragments, and wherein the peptide bond between amino acid Y-1 and
amino acid Y is within a loop of GFP.

3. Two GFP fragments according to the preceding claim, wherein GFP is selected from 
the group consisting of EGFP, EYFP, ECFP, dsRed and Renilla GFP. 

4. Two GFP fragments according to any of the preceding claims, wherein the GFP is 
EGFP. 

5. Two GFP fragments according to any of the preceding claims, wherein the GFP is
EYFP. 

6. Two GFP fragments according to any of the preceding claims, wherein the amino acid 
in position 1 preceding the chromophore has been mutated to provide an increase of 
fluorescence intensity. 

7. Two GFP fragments according to the preceding claim, wherein the amino acid F in
position 1 preceding the chromophore has been substituted by L. 



8. Two GFP fragments according to any of the preceding claims, wherein the GFP has
been mutated to further contain the S72A mutation.

9. Two GFP fragments according to any of the preceding claims, wherein X is between 9
and 10 within the Thr9-Val11 loop; or between 23 and 24 within the Asn23-His25 loop; or
between 38 and 39 within the Thr38-Gly40 loop; or between 48 and 55 within the Cys48-Pro56
loop; or between 72 and 75 within the Ser72-Asp76 loop; or between 81 and 82
within the His81-Phe83 loop; or between 88 and 89 within the Met88-Glu90 loop; or between
101 and 102 within the Lys101-Asp103 loop; or between 114 and 117 within the Phe114-Thr118
loop; or between 128 and 144 within the Ile128-Tyr145 loop; or between 154 and
159 within the Ala154-Gly160 loop; or between 171 and 174 within the Ile171-Ser175
loop; or between 188 and 196 within the Ile188-Asp197 loop; or between 210 and 214
within the Asp210-Arg215 loop.

10. Two GFP fragments according to the preceding claim, wherein X is between 154 and
159 within the Ala154-Gly160 loop.

11. Two GFP fragments according to the preceding claim, wherein X is 157 within the
Ala154-Gly160 loop.

12. Two GFP fragments according to the preceding claim, wherein X is between 171 and
174 within the Ile171-Ser175 loop.

13. Two GFP fragments according to any of the preceding claims, wherein X is 172 within
the Ile171-Ser175 loop.

14. Two GFP fragments according to the preceding claim, wherein Y is between 154 and
159 within the Ala154-Gly160 loop.

15. Two GFP fragments according to the preceding claim, wherein Y is 157 within the
Ala154-Gly160 loop.

16. Two GFP fragments according to any of the preceding claims, wherein the N-terminal
fragment of GFP is fused in frame with a first protein of interest.

17. Two GFP fragments according to any of the preceding claims, wherein the first protein
of interest is fused to the N-terminal of the N-terminal fragment of GFP.

18. Two GFP fragments according to any of the preceding claims, wherein the first protein
of interest is fused to the C-terminal of the N-terminal fragment of GFP.

19. Two GFP fragments according to any of the preceding claims, wherein the C-terminal
fragment of GFP is fused in frame with a second protein of interest.

20. Two GFP fragments according to any of the preceding claims, wherein the second
protein of interest is fused to the N-terminal of the C-terminal fragment of GFP.

21. Two GFP fragments according to any of the preceding claims, wherein the second
protein of interest is fused to the C-terminal of the C-terminal fragment of GFP.

22. Two GFP fragments according to any of the preceding claims, wherein the N-terminal
fragment of GFP fused in frame to a first protein of interest further comprises a linker
sequence between the N-terminal fragment of GFP and the first protein of interest.

23. Two GFP fragments according to any of the preceding claims, wherein the C-terminal
fragment of GFP fused in frame to a second protein of interest further comprises a linker
sequence between the C-terminal fragment of GFP and the second protein of interest.

24. Two GFP fragments according to any of the preceding claims, wherein the GFP is
EYFP further containing an F64L mutation, wherein X is 172, wherein the first protein of
interest fused to the N-terminal fragment of GFP is fused to the C-terminal of the
N-terminal fragment of GFP and wherein the second protein of interest fused to the
C-terminal fragment of GFP is fused to the N-terminal of the C-terminal fragment of GFP.

25. Two GFP fragments according to any of the preceding claims, wherein the GFP is
EYFP further containing an F64L mutation, wherein X is 157, wherein the first protein of
interest fused to the N-terminal fragment of GFP is fused to the C-terminal of the
N-terminal fragment of GFP and wherein the second protein of interest fused to the
C-terminal fragment of GFP is fused to the N-terminal of the C-terminal fragment of GFP.

26. The N-terminal fragment of GFP according to any of the preceding claims.

27. The C-terminal fragment of GFP according to any of the preceding claims.

28. Nucleic acid encoding a fragment according to any of the preceding claims.

29. A cell comprising an N-terminal fragment of GFP according to any of the preceding
claims.

30. A cell comprising a C-terminal fragment of GFP according to any of the preceding
claims.

31. A cell comprising the two GFP fragments according to any of the preceding claims. 

32. A vector comprising the two GFP fragments according to any of the preceding claims. 

33. A vector comprising the N-terminal fragment of GFP according to any of the preceding 
claims. 

34. A vector comprising the C-terminal fragment of GFP according to any of the preceding
claims.

35. A plasmid comprising the two GFP fragments according to any of the preceding
claims.

36. A plasmid comprising the N-terminal fragment of GFP according to any of the
preceding claims.

37. A plasmid comprising the C-terminal fragment of GFP according to any of the 
preceding claims. 

38. A method for detecting the interaction between two proteins of interest comprising the
steps of:

(a) providing at least one cell that contains two heterologous conjugates,

the first heterologous conjugate comprising a first protein of interest conjugated to an
N-terminal fragment of GFP according to any of the preceding claims,
the second heterologous conjugate comprising a second protein of interest conjugated
to a C-terminal fragment of GFP according to any of the preceding claims; and

(b) measuring the fluorescence from the at least one cell,

fluorescent cells indicating interaction between the two proteins of interest.

39. A method for monitoring the interaction between two proteins of interest comprising
the steps of:

(a) providing at least one cell containing at least one stretch of nucleic acid encoding
two heterologous conjugates,

the first heterologous conjugate comprising a first protein of interest conjugated to an
N-terminal fragment of GFP according to any of the preceding claims,
the second heterologous conjugate comprising a second protein of interest conjugated
to a C-terminal fragment of GFP according to any of the preceding claims;

(b) culturing the at least one cell under conditions allowing expression; and
(c) measuring the fluorescence from the at least one cell,

fluorescent cells indicating interaction between the two proteins of interest.

40. A method according to any of the preceding claims, wherein the at least one cell is a
mammalian cell.
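
As an illustrative aid only, the ranges of X recited in claim 9 can be tabulated and queried with a short Python sketch. The names CLAIM_9_X_RANGES and loops_containing are hypothetical, and treating "between A and B" as inclusive of both end points is an assumption of the example, not a construction of the claim.

```python
# Ranges of X recited in claim 9, keyed by the loop named there.
# Assumption: "between A and B" is read as A <= X <= B.
CLAIM_9_X_RANGES = {
    "Thr9-Val11": (9, 10),
    "Asn23-His25": (23, 24),
    "Thr38-Gly40": (38, 39),
    "Cys48-Pro56": (48, 55),
    "Ser72-Asp76": (72, 75),
    "His81-Phe83": (81, 82),
    "Met88-Glu90": (88, 89),
    "Lys101-Asp103": (101, 102),
    "Phe114-Thr118": (114, 117),
    "Ile128-Tyr145": (128, 144),
    "Ala154-Gly160": (154, 159),
    "Ile171-Ser175": (171, 174),
    "Ile188-Asp197": (188, 196),
    "Asp210-Arg215": (210, 214),
}


def loops_containing(x):
    """Return the names of the listed loops whose range of X includes x."""
    return [name for name, (low, high) in CLAIM_9_X_RANGES.items()
            if low <= x <= high]


print(loops_containing(157))  # prints: ['Ala154-Gly160']
print(loops_containing(172))  # prints: ['Ile171-Ser175']
```

The two printed examples correspond to the split sites 157/158 and 172/173 used in the experiments.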